Streptococcus pneumoniae Antigens and Vaccines

Field of the Invention 

The present invention relates to novel Streptococcus pneumoniae 
antigens for the detection of Streptococcus and for the prevention or attenuation 
of disease caused by Streptococcus. The invention further relates to isolated 
nucleic acid molecules encoding antigenic polypeptides of S. pneumoniae. 
Antigenic polypeptides are also provided, as are vectors, host cells and 
recombinant methods for producing the same. The invention additionally relates 
to diagnostic methods for detecting Streptococcus gene expression. 

Background of the Invention 

Streptococcus pneumoniae has been one of the most extensively studied 
microorganisms since its first isolation in 1881. It was the object of many 
investigations that led to important scientific discoveries. In 1928, Griffith 
observed that when heat-killed encapsulated pneumococci and live strains
constitutively lacking any capsule were concomitantly injected into mice, the 
nonencapsulated could be converted into encapsulated pneumococci with the 
same capsular type as the heat-killed strain. Years later, the nature of this 
"transforming principle," or carrier of genetic information, was shown to be 
DNA. (Avery, O.T., et al., J. Exp. Med. 79:137-157 (1944)).

In spite of the vast number of publications on S. pneumoniae, many
questions about its virulence are still unanswered, and this pathogen remains a 
major causative agent of serious human disease, especially community-acquired 
pneumonia. (Johnston, R.B., et al., Rev. Infect. Dis. 13(Suppl. 6):S509-517
(1991)). In addition, in developing countries, the pneumococcus is responsible 
for the death of a large number of children under the age of 5 years from 
pneumococcal pneumonia. The incidence of pneumococcal disease is highest in 
infants under 2 years of age and in people over 60 years of age. Pneumococci 
are the second most frequent cause (after Haemophilus influenzae type b) of 
bacterial meningitis and otitis media in children. With the recent introduction of 
conjugate vaccines for H. influenzae type b, pneumococcal meningitis is likely 
to become increasingly prominent. S. pneumoniae is the most important 
etiologic agent of community-acquired pneumonia in adults and is the second 
most common cause of bacterial meningitis behind Neisseria meningitidis. 

The antibiotic generally prescribed to treat S. pneumoniae is
benzylpenicillin, although resistance to this and to other antibiotics is found
occasionally. Pneumococcal resistance to penicillin results from mutations in its
penicillin-binding proteins. In uncomplicated pneumococcal pneumonia caused
by a sensitive strain, treatment with penicillin is usually successful unless
started too late. Erythromycin or clindamycin can be used to treat pneumonia in
patients hypersensitive to penicillin, but strains resistant to these drugs exist.
Broad spectrum antibiotics (e.g., the tetracyclines) may also be effective,
although tetracycline-resistant strains are not rare. In spite of the availability of
antibiotics, the mortality of pneumococcal bacteremia in the last four decades
has remained stable between 25 and 29%. (Gillespie, S.H., et al., J. Med.
Microbiol. 28:237-248 (1989)).

S. pneumoniae is carried in the upper respiratory tract by many healthy
individuals. It has been suggested that attachment of pneumococci is mediated
by a disaccharide receptor on fibronectin, present on human pharyngeal
epithelial cells. (Anderson, B.J., et al., J. Immunol. 142:2464-2468 (1989)).
The mechanisms by which pneumococci translocate from the nasopharynx to
the lung, thereby causing pneumonia, or migrate to the blood, giving rise to
bacteremia or septicemia, are poorly understood. (Johnston, R.B., et al., Rev.
Infect. Dis. 13(Suppl. 6):S509-517 (1991)).

Various proteins have been suggested to be involved in the pathogenicity
of S. pneumoniae; however, only a few of them have actually been confirmed
as virulence factors. Pneumococci produce an IgA1 protease that might
interfere with host defense at mucosal surfaces. (Kornfield, S.J., et al., Rev.
Inf. Dis. 5:521-534 (1981)). S. pneumoniae also produces neuraminidase, an
enzyme that may facilitate attachment to epithelial cells by cleaving sialic acid
from the host glycolipids and gangliosides. Partially purified neuraminidase
was observed to induce meningitis-like symptoms in mice; however, the
reliability of this finding has been questioned because the neuraminidase
preparations used were probably contaminated with cell wall products. Other
pneumococcal proteins besides neuraminidase are involved in the adhesion of
pneumococci to epithelial and endothelial cells. These pneumococcal proteins
have as yet not been identified. Recently, Cundell et al. reported that peptide
permeases can modulate pneumococcal adherence to epithelial and endothelial
cells. It was, however, unclear whether these permeases function directly as
adhesins or whether they enhance adherence by modulating the expression of
pneumococcal adhesins. (DeVelasco, E.A., et al., Micro. Rev. 59:591-603
(1995)). A better understanding of the virulence factors determining its
pathogenicity will need to be developed to cope with the devastating effects of
pneumococcal disease in humans.

Ironically, despite the prominent role of S. pneumoniae in the discovery
of DNA, little is known about the molecular genetics of the organism. The S.
pneumoniae genome consists of one circular, covalently closed,
double-stranded DNA and a collection of so-called variable accessory elements, such as
prophages, plasmids, transposons and the like. Most physical characteristics 
and almost all of the genes of S. pneumoniae are unknown. Among the few 
that have been identified, most have not been physically mapped or 
characterized in detail. Only a few genes of this organism have been sequenced. 
(See, for instance, current versions of GENBANK and other nucleic acid
databases, and references that relate to the genome of S. pneumoniae such as 
those set out elsewhere herein.) Identification of in vivo-expressed, and 
broadly protective, antigens of S. pneumoniae has remained elusive. 

Summary of the Invention 

The present invention provides isolated nucleic acid molecules 
comprising polynucleotides encoding the S. pneumoniae polypeptides described
in Table 1 and having the amino acid sequences shown as SEQ ID NO:2, SEQ
ID NO:4, SEQ ID NO:6, and so on through SEQ ID NO:226. Thus, one aspect
of the invention provides isolated nucleic acid molecules comprising
polynucleotides having a nucleotide sequence selected from the group consisting
of: (a) a nucleotide sequence encoding any of the amino acid sequences of the
polypeptides shown in Table 1; and (b) a nucleotide sequence complementary to
any of the nucleotide sequences in (a). 

Further embodiments of the invention include isolated nucleic acid 
molecules that comprise a polynucleotide having a nucleotide sequence at least 
90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99% 
identical, to any of the nucleotide sequences in (a) or (b) above, or a 
polynucleotide which hybridizes under stringent hybridization conditions to a 
polynucleotide in (a) or (b) above. This polynucleotide which hybridizes does 
not hybridize under stringent hybridization conditions to a polynucleotide 
having a nucleotide sequence consisting of only A residues or of only T 
residues. Additional nucleic acid embodiments of the invention relate to isolated 
nucleic acid molecules comprising polynucleotides which encode the amino acid 
sequences of epitope-bearing portions of an S. pneumoniae polypeptide having 
an amino acid sequence in (a) above. 

The present invention also relates to recombinant vectors, which include 
the isolated nucleic acid molecules of the present invention, and to host cells 
containing the recombinant vectors, as well as to methods of making such
vectors and host cells and for using these vectors for the production of S.
pneumoniae polypeptides or peptides by recombinant techniques. 

The invention further provides isolated S. pneumoniae polypeptides 
having an amino acid sequence selected from the group consisting of an amino 
acid sequence of any of the polypeptides described in Table 1. 

The polypeptides of the present invention also include polypeptides 
having an amino acid sequence with at least 70% similarity, and more preferably 
at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% similarity to 
those described in Table 1, as well as polypeptides having an amino acid 
sequence at least 70% identical, more preferably at least 75% identical, and still 
more preferably 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to 
those above; as well as isolated nucleic acid molecules encoding such 
polypeptides. 

The present invention further provides a vaccine, preferably a 
multi-component vaccine comprising one or more of the S. pneumoniae 
polynucleotides, or polypeptides described in Table 1, or fragments thereof, 
together with a pharmaceutically acceptable diluent, carrier, or excipient,
wherein the S. pneumoniae polypeptide(s) are present in an amount effective to 
elicit an immune response to members of the Streptococcus genus in an animal. 
The S. pneumoniae polypeptides of the present invention may further be 
combined with one or more immunogens of one or more other streptococcal or 
non-streptococcal organisms to produce a multi-component vaccine intended to 
elicit an immunological response against members of the Streptococcus genus 
and, optionally, one or more non-streptococcal organisms. 

The vaccines of the present invention can be administered in a DNA 
form, e.g., "naked" DNA, wherein the DNA encodes one or more streptococcal 
polypeptides and, optionally, one or more polypeptides of a non-streptococcal 
organism. The DNA encoding one or more polypeptides may be constructed 
such that these polypeptides are expressed as fusion proteins.

The vaccines of the present invention may also be administered as a 
component of a genetically engineered organism. Thus, a genetically 
engineered organism which expresses one or more S. pneumoniae polypeptides 
may be administered to an animal. For example, such a genetically engineered 
organism may contain one or more S. pneumoniae polypeptides of the present 
invention intracellularly, on its cell surface, or in its periplasmic space. Further,
such a genetically engineered organism may secrete one or more S. pneumoniae
polypeptides.

The vaccines of the present invention may be co-administered to an
animal with an immune system modulator (e.g., CD86 and GM-CSF).

The invention also provides a method of inducing an immunological 
response in an animal to one or more members of the Streptococcus genus, 
preferably one or more isolates of the species S. pneumoniae, comprising
administering to the animal a vaccine as described above. 

The invention further provides a method of inducing a protective 
immune response in an animal, sufficient to prevent or attenuate an infection by 
members of the Streptococcus genus, preferably at least S. pneumoniae,
comprising administering to the animal a composition comprising one or more 
of the polynucleotides or polypeptides described in Table 1, or fragments 
thereof. Further, these polypeptides, or fragments thereof, may be conjugated 
to another immunogen and/or administered in admixture with an adjuvant. 

The invention further relates to antibodies elicited in an animal by the 
administration of one or more S. pneumoniae polypeptides of the present
invention and to methods for producing such antibodies.

The invention also provides diagnostic methods for detecting the 
expression of genes of members of the Streptococcus genus in an animal. One 
such method involves assaying for the expression of a gene encoding S.
pneumoniae peptides in a sample from an animal. This expression may be 
assayed either directly (e.g., by assaying polypeptide levels using antibodies 
elicited in response to amino acid sequences described in Table 1) or indirectly
(e.g., by assaying for antibodies having specificity for amino acid sequences 
described in Table 1). An example of such a method involves the use of the 
polymerase chain reaction (PCR) to amplify and detect Streptococcus nucleic 
acid sequences. 

The present invention also relates to nucleic acid probes having all or 
part of a nucleotide sequence described in Table 1 (shown as SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, and so on through SEQ ID NO:225) which are 
capable of hybridizing under stringent conditions to Streptococcus nucleic acids. 
The invention further relates to a method of detecting one or more Streptococcus 
nucleic acids in a biological sample obtained from an animal, said one or more 
nucleic acids encoding Streptococcus polypeptides, comprising: (a) contacting 
the sample with one or more of the above-described nucleic acid probes, under 
conditions such that hybridization occurs, and (b) detecting hybridization of said 
one or more probes to the Streptococcus nucleic acid present in the biological 
sample.

The invention also includes immunoassays, including an immunoassay
for detecting Streptococcus, preferably at least isolates of the species S.
pneumoniae, comprising incubation of a sample (which is suspected of being
infected with Streptococcus) with a probe antibody directed against an
antigen/epitope of S. pneumoniae to be detected, under conditions allowing the
formation of an antigen-antibody complex; and detecting the antigen-antibody
complex which contains the probe antibody. The invention further includes an
immunoassay for the detection of antibodies which are directed against a
Streptococcus antigen, comprising the incubation of a sample (containing
antibodies from a mammal suspected of being infected with Streptococcus) with
a probe polypeptide including an epitope of S. pneumoniae, under conditions
that allow the formation of antigen-antibody complexes which contain the probe
epitope-containing antigen.

Some aspects of the invention pertaining to kits are those for: 
investigating samples for the presence of polynucleotides derived from 
Streptococcus which comprise a polynucleotide probe including a nucleotide 
sequence selected from Table 1 or a fragment thereof of approximately 15 or
more nucleotides, in an appropriate container; analyzing the samples for the 
presence of antibodies directed against a Streptococcus antigen made up of a 
polypeptide which contains a S. pneumoniae epitope present in the polypeptide, 
in a suitable container; and analyzing samples for the presence of Streptococcus 
antigens made up of an anti-S. pneumoniae antibody, in a suitable container.

Detailed Description 

The present invention relates to recombinant antigenic S. pneumoniae 
polypeptides and fragments thereof. The invention also relates to methods for 
using these polypeptides to produce immunological responses and to confer 
immunological protection to disease caused by members of the genus 
Streptococcus, including at least isolates of the species S. pneumoniae. The invention
further relates to nucleic acid sequences which encode antigenic S. pneumoniae 
polypeptides and to methods for detecting S. pneumoniae nucleic acids and 
polypeptides in biological samples. The invention also relates to S. 
pneumoniae-specific antibodies and methods for detecting such antibodies 
produced in a host animal. 

Definitions 

The following definitions are provided to clarify the subject matter 
which the inventors consider to be the present invention.

As used herein, the phrase "pathogenic agent" means an agent which
causes a disease state or affliction in an animal. Included within this definition,
for example, are bacteria, protozoans, fungi, viruses and metazoan parasites
which either produce a disease state or render an animal infected with such an
organism susceptible to a disease state (e.g., a secondary infection). Further
included are species and strains of the genus Streptococcus which produce
disease states in animals.

As used herein, the term "organism" means any living biological
system, including viruses, regardless of whether it is a pathogenic agent.

As used herein, the term "Streptococcus" means any species or strain of
bacteria which is a member of the genus Streptococcus. Such species and
strains are known to those of skill in the art, and include those that are
pathogenic and those that are not.

As used herein, the phrase "one or more S. pneumoniae polypeptides of
the present invention" means polypeptides comprising the amino acid sequence
of one or more of the S. pneumoniae polypeptides described in Table 1 and
disclosed as SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, and so on through
SEQ ID NO:226. These polypeptides may be expressed as fusion proteins
wherein the S. pneumoniae polypeptides of the present invention are linked to
additional amino acid sequences which may be of streptococcal or
non-streptococcal origin. This phrase further includes polypeptides comprising
fragments of the S. pneumoniae polypeptides of the present invention.

Additional definitions are provided throughout the specification. 

Explanation of Table 1

Table 1, below, provides information describing 113 open reading
frames (ORFs) which encode potentially antigenic polypeptides of S.
pneumoniae of the present invention. The table lists the ORF identifier, which
consists of the letters SP, which denote S. pneumoniae, followed immediately
by a three digit numeric code, which arbitrarily numbers the potentially antigenic
polypeptides of S. pneumoniae of the present invention, and the nucleotide or
amino acid sequence of each ORF and encoded polypeptide. The table further
correlates the ORF identifier with a sequence identification number (SEQ ID
NO:). The actual nucleotide or amino acid sequence of each ORF identifier is
also shown in the Sequence Listing under the corresponding SEQ ID NO.

Thus, for example, the designation "SP126" refers to both the 
nucleotide and amino acid sequences of S. pneumoniae polypeptide number 126 
of the present invention. Further, "SP126" correlates with the nucleotide
sequence shown as SEQ ID NO:223 and with the amino acid sequence shown
as SEQ ID NO:224 as is described in Table 1. 

The open reading frame within each "ORF" begins with the second 
nucleotide shown. Thus, the first codon for each nucleotide sequence shown is 
bases 2-4, the second 5-7, the third 8-10, and so on. 
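
The reading-frame convention just described can be illustrated with a minimal Python sketch. It is not part of the original disclosure: the function name codons_from_second_base and the short example sequence are hypothetical and serve only to show how the codons are counted from the second base.

```python
def codons_from_second_base(seq: str) -> list[str]:
    """Split a nucleotide sequence into codons, skipping the first base.

    With 1-based numbering, the first codon is bases 2-4, the second 5-7,
    the third 8-10, and so on. Trailing bases that do not fill a complete
    codon are dropped.
    """
    body = seq.upper()[1:]               # discard base 1
    usable = len(body) - len(body) % 3   # keep whole codons only
    return [body[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical 12-base sequence, for illustration only.
example = "GATGAAACGTTT"
print(codons_from_second_base(example))
```

For the hypothetical input above, the codons are read as ATG (bases 2-4), AAA (bases 5-7) and CGT (bases 8-10); the two remaining bases are ignored.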

Explanation of Table 2 

Table 2 lists the antigenic epitopes present in each of the S. pneumoniae 
polypeptides described in Table 1 as predicted by the inventors. Each S.
pneumoniae polypeptide shown in Table 1 has one or more antigenic epitopes 
described in Table 2. It will be appreciated that depending on the analytical 
criteria used to predict antigenic determinants, the exact address of the 
determinant may vary slightly. The exact location of the antigenic determinant 
may shift by about 1 to 5 residues, more likely 1 to 2 residues, depending on 
the criteria used. Thus, the first antigenic determinant described in Table 2,
"Lys-1 to Ile-10" of SP001, represents a peptide comprising the lysine at
position 1 in SEQ ID NO:2 through and including the isoleucine at position 10 
in SEQ ID NO:2, but may include more or fewer residues than those 10. It will 
also be appreciated that, generally speaking, amino acids can be added to either 
terminus of a peptide or polypeptide containing an antigenic epitope without 
affecting its activity, whereas removing residues from a peptide or polypeptide 
containing only the antigenic determinant is much more likely to destroy 
activity. It will be appreciated that the residues and locations described
in Table 2 correspond to the amino acid sequences for each ORF shown in 
Table 1 and in the Sequence Listing. 

Explanation of Table 3 

Table 3 shows PCR primers designed by the inventors for the 
amplification of polynucleotides encoding polypeptides of the present invention 
according to the method of Example 1. PCR primer design is routine in the art
and those shown in Table 3 are provided merely for the convenience of the 
skilled artisan. It will be appreciated that others can be used with equal success. 

For each primer, the table lists the corresponding ORF designation from 
Table 1 followed by either an "A" or a "B". The "A" primers are the 5' primers 
and the "B" primers are the 3' primers. A restriction enzyme site was built into
each primer to allow ease of cloning. The restriction enzyme which will recognize
and cleave a sequence within each primer is shown in Table 3, as well, under the
heading "RE" for restriction enzyme. Finally, the sequence identifier is shown in Table 3
for each primer for easy correlation with the Sequence Listing. 

Selection of Nucleic Acid Sequences Encoding Antigenic S. pneumoniae Polypeptides

The present invention provides a select number of ORFs from those 
presented in the fragments of the S. pneumoniae genome which may prove 
useful for the generation of a protective immune response. The sequenced S.
pneumoniae genomic DNA was obtained from a sub-cultured isolate of S.
pneumoniae Strain 7/87 14.8.91, which has been deposited at the American
Type Culture Collection, as a convenience to those of skill in the art. The S.
pneumoniae isolate was deposited on October 10, 1996 at the ATCC, 12301
Park Lawn Drive, Rockville, Maryland 20852, and given accession number
55840. A genomic library constructed from DNA isolated from the S.
pneumoniae isolate was also deposited at the ATCC on October 11, 1996 and
given ATCC Deposit No. 97755. A more complete listing of the sequence
obtained from the S. pneumoniae genome may be found in co-pending U.S.
Provisional Application Serial No. 60/029,960, filed 10/31/96, incorporated
herein by reference in its entirety. Some ORFs contained in the subset of
fragments of the S. pneumoniae genome disclosed herein were derived through
the use of a number of screening criteria detailed below. 

The selected ORFs do not consist of complete ORFs. Although a 
polypeptide representing a complete ORF may be the closest approximation of a 
protein native to an organism, it is not always preferred to express a complete 
ORF in a heterologous system. It may be challenging to express and purify a 
highly hydrophobic protein by common laboratory methods. Thus, the 
polypeptide vaccine candidates described herein may have been modified 
slightly to simplify the production of recombinant protein. For example, 
nucleotide sequences which encode highly hydrophobic domains, such as those 
found at the amino terminal signal sequence, have been excluded from some 
constructs used for in vitro expression of the polypeptides. Furthermore, any 
highly hydrophobic amino acid sequences occurring at the carboxy terminus 
have also been excluded from the recombinant expression constructs. Thus, in 
one embodiment, a polypeptide which represents a truncated or modified ORF 
may be used as an antigen. 

While numerous methods are known in the art for selecting potentially 
immunogenic polypeptides, many of the ORFs disclosed herein were selected
on the basis of screening all theoretical S. pneumoniae ORFs for several aspects
of potential immunogenicity. One set of selection criteria is as follows:

1. Type I signal sequence: An amino terminal type I signal sequence
generally directs a nascent protein across the plasma and outer membranes to the 
exterior of the bacterial cell. Experimental evidence obtained from studies with 
Escherichia coli suggests that the typical type I signal sequence consists of the 
following biochemical and physical attributes (Izard, J. W. and Kendall, D. A. 
Mol. Microbiol. 13:765-773 (1994)). The length of the type I signal sequence
is approximately 15 to 25 primarily hydrophobic amino acid residues with a net 
positive charge in the extreme amino terminus. In addition, the central region of 
the signal sequence adopts an alpha-helical conformation in a hydrophobic 
environment. Finally, the region surrounding the actual site of cleavage is 
ideally six residues long, with small side-chain amino acids in the -1 and -3 
positions. 

2. Type IV signal sequence: The type IV signal sequence is an example 
of the several types of functional signal sequences which exist in addition to the 
type I signal sequence detailed above. Although functionally related, the type 
IV signal sequence possesses a unique set of biochemical and physical attributes 
(Strom, M. S. and Lory, S., J. Bacteriol. 174:7345-7351 (1992)). These are
typically six to eight amino acids with a net basic charge followed by an 
additional sixteen to thirty primarily hydrophobic residues. The cleavage site of 
a type IV signal sequence is typically after the initial six to eight amino acids at 
the extreme amino terminus. In addition, type IV signal sequences generally 
contain a phenylalanine residue at the +1 site relative to the cleavage site.

3. Lipoprotein: Studies of the cleavage sites of twenty-six bacterial
lipoprotein precursors have allowed the definition of a consensus amino acid
sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial
lipoprotein precursors examined contained the sequence L-(A,S)-(G,A)-C at
positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu, H.
C., J. Bioenerg. Biomembr. 22:451-471 (1990)).

4. LPXTG motif: It has been experimentally determined that most 
anchored proteins found on the surface of gram-positive bacteria possess a 
highly conserved carboxy terminal sequence. More than fifty such proteins
from organisms such as S. pyogenes, S. mutans, E. faecalis, S. pneumoniae,
and others, have been identified based on their extracellular location and 
carboxy terminal amino acid sequence (Fischetti, V. A., ASM News 
62:405-410 (1996)). The conserved region consists of six charged amino acids 
at the extreme carboxy terminus coupled to 15-20 hydrophobic amino acids
presumed to function as a transmembrane domain. Immediately adjacent to the 
transmembrane domain is a six amino acid sequence conserved in nearly all 
proteins examined. The amino acid sequence of this region is L-P-X-T-G-X, 
where X is any amino acid. 

An algorithm for selecting antigenic and immunogenic S. pneumoniae 
polypeptides including the foregoing criteria was developed. Use of the 
algorithm by the inventors to select immunologically useful S. pneumoniae
polypeptides resulted in the selection of a number of the disclosed ORFs. 
Polypeptides comprising the polypeptides identified in this group may be 
produced by techniques standard in the art and as further described herein. 
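
The inventors' selection algorithm itself is not reproduced in this specification. Purely as a hedged illustration of how two of the criteria above could be screened, the following Python sketch searches a protein sequence for the lipoprotein consensus L-(A,S)-(G,A)-C and the L-P-X-T-G motif. The function name screen_motifs, the search-window sizes, and the test sequences are assumptions made for this example and are not taken from the disclosure.

```python
import re

# Consensus patterns taken from criteria 3 and 4 above.
LIPOBOX = re.compile(r"L[AS][GA]C")   # L-(A,S)-(G,A)-C, positions -3 to +1
LPXTG = re.compile(r"LP.TG")          # L-P-X-T-G, where X is any amino acid

# Assumed search windows (not specified in the text): how far from each
# terminus a motif may lie and still be counted.
N_TERM_WINDOW = 40
C_TERM_WINDOW = 50


def screen_motifs(protein: str) -> dict[str, bool]:
    """Report which of the two sequence motifs a protein carries."""
    seq = protein.upper()
    return {
        "lipoprotein": LIPOBOX.search(seq[:N_TERM_WINDOW]) is not None,
        "lpxtg": LPXTG.search(seq[-C_TERM_WINDOW:]) is not None,
    }


# Hypothetical sequences, for illustration only.
lipo_like = "MKKLLALAGCSSN" + "A" * 60
anchor_like = "M" + "Q" * 60 + "LPKTGE" + "A" * 18 + "KRRKEN"
print(screen_motifs(lipo_like))
print(screen_motifs(anchor_like))
```

A pattern match of this kind only flags candidates. The signal sequence criteria (length, hydrophobicity, charge and helical propensity) would require additional scoring that is not attempted here.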

Nucleic Acid Molecules 

The present invention provides isolated nucleic acid molecules 
comprising polynucleotides encoding the S. pneumoniae polypeptides having 
the amino acid sequences described in Table 1 and shown as SEQ ID NO:2, 
SEQ ID NO:4, SEQ ID NO:6, and so on through SEQ ID NO:226, which were
determined by sequencing the genome of S. pneumoniae and selected as
putative immunogens. 

Unless otherwise indicated, all nucleotide sequences determined by 
sequencing a DNA molecule herein were determined using an automated DNA 
sequencer (such as the Model 373 from Applied Biosystems, Inc.), and all 
amino acid sequences of polypeptides encoded by DNA molecules determined 
herein were predicted by translation of DNA sequences determined as above. 
Therefore, as is known in the art for any DNA sequence determined by this 
automated approach, any nucleotide sequence determined herein may contain 
some errors. Nucleotide sequences determined by automation are typically at 
least about 90% identical, more typically at least about 95% to at least about 
99.9% identical to the actual nucleotide sequence of the sequenced DNA 
molecule. The actual sequence can be more precisely determined by other 
approaches including manual DNA sequencing methods well known in the art. 
As is also known in the art, a single insertion or deletion in a determined 
nucleotide sequence compared to the actual sequence will cause a frame shift in 
translation of the nucleotide sequence such that the predicted amino acid 
sequence encoded by a determined nucleotide sequence will be completely 
different from the amino acid sequence actually encoded by the sequenced DNA 
molecule, beginning at the point of such an insertion or deletion. 
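
The effect of a single insertion on the predicted amino acid sequence can be shown with a short Python sketch. It is an illustration only: the codon table is the standard genetic code, built here in the conventional T, C, A, G ordering, and the 15-base sequence and the position of the inserted base are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3   # ignore an incomplete final codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical "actual" sequence and a "determined" sequence carrying one
# spurious inserted base after the first codon.
actual = "ATGGCTCAACGTCTG"
determined = actual[:3] + "A" + actual[3:]

print(translate(actual))       # MAQRL
print(translate(determined))   # MSSTS
```

Every residue downstream of the inserted base differs between the two translations, which is the frame shift described above.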

Unless otherwise indicated, each "nucleotide sequence" set forth herein
is presented as a sequence of deoxyribonucleotides (abbreviated A, G, C and
T). However, by "nucleotide sequence" of a nucleic acid molecule or
polynucleotide is intended, for a DNA molecule or polynucleotide, a sequence
of deoxyribonucleotides, and for an RNA molecule or polynucleotide, the
corresponding sequence of ribonucleotides (A, G, C and U), where each
thymidine deoxyribonucleotide (T) in the specified deoxyribonucleotide
sequence is replaced by the ribonucleotide uridine (U). For instance, reference
to an RNA molecule having a sequence described in Table 1 set forth using
deoxyribonucleotide abbreviations is intended to indicate an RNA molecule
having a sequence in which each deoxyribonucleotide A, G or C described in
Table 1 has been replaced by the corresponding ribonucleotide A, G or C, and
each deoxyribonucleotide T has been replaced by a ribonucleotide U.
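
The correspondence between a DNA sequence and its RNA counterpart amounts to a single substitution. The following Python sketch shows it; the function name to_rna and the input check are assumptions added for illustration.

```python
def to_rna(dna: str) -> str:
    """Return the RNA sequence corresponding to a DNA sequence.

    A, G and C are kept unchanged and every T is replaced by U.
    """
    seq = dna.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.replace("T", "U")


# Hypothetical sequence, for illustration only.
print(to_rna("ATGCTT"))
```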

Nucleic acid molecules of the present invention may be in the form of
RNA, such as mRNA, or in the form of DNA, including, for instance, cDNA
and genomic DNA obtained by cloning or produced synthetically. The DNA
may be double-stranded or single-stranded. Single-stranded DNA or RNA may
be the coding strand, also known as the sense strand, or it may be the
non-coding strand, also referred to as the anti-sense strand.

By "isolated" nucleic acid molecule(s) is intended a nucleic acid
molecule, DNA or RNA, which has been removed from its native environment.
For example, recombinant DNA molecules contained in a vector are considered
isolated for the purposes of the present invention. Further examples of isolated
DNA molecules include recombinant DNA molecules maintained in
heterologous host cells or purified (partially or substantially) DNA molecules in
solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of
the DNA molecules of the present invention. Isolated nucleic acid molecules
according to the present invention further include such molecules produced
synthetically.

Isolated nucleic acid molecules of the present invention include DNA
molecules comprising a nucleotide sequence described in Table 1 and shown as
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, and so on through SEQ ID
NO:225; DNA molecules comprising the coding sequences for the polypeptides
described in Table 1 and shown as SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, and so on through SEQ ID NO:226; and DNA molecules which comprise
sequences substantially different from those described above but which, due to
the degeneracy of the genetic code, still encode the S. pneumoniae polypeptides
described in Table 1. Of course, the genetic code is well known in the art.
Thus, it would be routine for one skilled in the art to generate such degenerate
variants.
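
By way of illustration only, the generation of degenerate variants can be sketched as follows. The sketch assumes the standard genetic code, written as the usual 64-letter table in T, C, A, G order; the names `CODON_TABLE`, `translate` and `degenerate_variants` are hypothetical. The number of variants grows multiplicatively with sequence length, so exhaustive enumeration is practical only for short coding segments.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def degenerate_variants(dna: str) -> list[str]:
    """List every DNA sequence that encodes the same amino acid sequence as `dna`."""
    protein = translate(dna)
    # For each residue, collect all codons that encode it.
    choices = [
        [codon for codon, aa in CODON_TABLE.items() if aa == residue]
        for residue in protein
    ]
    return ["".join(combo) for combo in product(*choices)]
```

For example, `degenerate_variants("ATGAAA")` returns the two sequences `ATGAAA` and `ATGAAG`, both of which encode Met-Lys.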

The invention also provides nucleic acid molecules having sequences
complementary to any one of those described in Table 1. Such isolated 
molecules, particularly DNA molecules, are useful as probes for detecting 
expression of Streptococcal genes, for instance, by Northern blot analysis or the 
polymerase chain reaction (PCR). 
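
By way of illustration only, a complementary sequence of the kind used as a probe can be derived as sketched below. The helper name `reverse_complement` is hypothetical; treating the purine symbol R and the pyrimidine symbol Y as complements of one another, and N as its own complement, follows ordinary base-pairing and is stated here as an assumption of the sketch.

```python
# Base-pairing map: A<->T, C<->G; R (purine) pairs with Y (pyrimidine); N stays N.
_COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]
```

For instance, `reverse_complement("ATGC")` gives `"GCAT"`.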

The present invention is further directed to fragments of the isolated
nucleic acid molecules described herein. By a fragment of an isolated nucleic
acid molecule having a nucleotide sequence described in Table 1, is intended
fragments at least about 15 nt, and more preferably at least about 17 nt, still
more preferably at least about 20 nt, and even more preferably, at least about 25
nt in length which are useful as diagnostic probes and primers as discussed
herein. Of course, larger fragments 50-100 nt in length are also useful
according to the present invention as are fragments corresponding to most, if
not all, of a nucleotide sequence described in Table 1. By a fragment at least 20
nt in length, for example, is intended fragments which include 20 or more
contiguous bases of a nucleotide sequence as described in Table 1. Since the
nucleotide sequences identified in Table 1 are provided as SEQ ID NO:1, SEQ
ID NO:3, SEQ ID NO:5, and so on through SEQ ID NO:225, generating such
DNA fragments would be routine to the skilled artisan. For example, such
fragments could be generated synthetically.
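
By way of illustration only, the contiguous fragments of a given length can be enumerated with a sliding window. The function name `contiguous_fragments` is hypothetical, and the short sequences used in the example are placeholders rather than sequences from Table 1.

```python
def contiguous_fragments(sequence: str, length: int) -> list[str]:
    """Return every run of `length` contiguous bases found in `sequence`."""
    if length <= 0:
        raise ValueError("length must be positive")
    # A sequence shorter than `length` yields no fragments.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]
```

A 30 nt sequence, for example, contains 11 distinct windows of 20 contiguous bases.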

Preferred nucleic acid fragments of the present invention also include
nucleic acid molecules comprising nucleotide sequences encoding
epitope-bearing portions of the S. pneumoniae polypeptides identified in Table
1. Such nucleic acid fragments of the present invention include, for example,
nucleotide sequences encoding polypeptide fragments comprising from about
the amino terminal residue to about the carboxy terminal residue of each
fragment shown in Table 2. The above referred to polypeptide fragments are
antigenic regions of the S. pneumoniae polypeptides identified in Table 1.

In another aspect, the invention provides isolated nucleic acid molecules
comprising polynucleotides which hybridize under stringent hybridization
conditions to a portion of a polynucleotide in a nucleic acid molecule of the
invention described above, for instance, a nucleic acid sequence identified in
Table 1. By "stringent hybridization conditions" is intended overnight
incubation at 42°C in a solution comprising: 50% formamide, 5x SSC (1x SSC
is 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x
Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared
salmon sperm DNA, followed by washing the filters in 0.1x SSC at about
65°C.

By polynucleotides which hybridize to a "portion" of a polynucleotide is
intended polynucleotides (either DNA or RNA) which hybridize to at least about
15 nucleotides (nt), and more preferably at least about 17 nt, still more
preferably at least about 20 nt, and even more preferably about 25-70 nt of the
reference polynucleotide. These are useful as diagnostic probes and primers as
discussed above and in more detail below.

Of course, polynucleotides hybridizing to a larger portion of the
reference polynucleotide, for instance, a portion 50-100 nt in length, or even to
the entire length of the reference polynucleotide, are also useful as probes
according to the present invention, as are polynucleotides corresponding to
most, if not all, of a nucleotide sequence as identified in Table 1. By a portion
of a polynucleotide of "at least 20 nt in length," for example, is intended 20 or
more contiguous nucleotides from the nucleotide sequence of the reference
polynucleotide (e.g., a nucleotide sequence as described in Table 1). As noted
above, such portions are useful diagnostically either as probes according to
conventional DNA hybridization techniques or as primers for amplification of a
target sequence by PCR, as described in the literature (for instance, in Molecular
Cloning, A Laboratory Manual, 2nd edition, Sambrook, J., Fritsch, E. F. and
Maniatis, T., eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
N.Y. (1989), the entire disclosure of which is hereby incorporated herein by
reference).

Since nucleic acid sequences encoding the S. pneumoniae polypeptides
of the present invention are identified in Table 1 and provided as SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, and so on through SEQ ID NO:225, generating
polynucleotides which hybridize to portions of these sequences would be
routine to the skilled artisan. For example, the hybridizing polynucleotides of
the present invention could be generated synthetically according to known
techniques.

As indicated, nucleic acid molecules of the present invention which
encode S. pneumoniae polypeptides of the present invention may include, but
are not limited to those encoding the amino acid sequences of the polypeptides
by themselves; and additional coding sequences which code for additional
amino acids, such as those which provide additional functionalities. Thus, the
sequences encoding these polypeptides may be fused to a marker sequence,
such as a sequence encoding a peptide which facilitates purification of the fused
polypeptide. In certain preferred embodiments of this aspect of the invention,
the marker amino acid sequence is a hexa-histidine peptide, such as the tag
provided in a pQE vector (Qiagen, Inc.), among others, many of which are
commercially available. As described by Gentz and colleagues (Proc. Natl.
Acad. Sci. USA 86:821-824 (1989)), for instance, hexa-histidine provides for
convenient purification of the resulting fusion protein.

Thus, the present invention also includes genetic fusions wherein the
S. pneumoniae nucleic acid coding sequences identified in Table 1 are
linked to additional nucleic acid sequences to produce fusion proteins. These
fusion proteins may include epitopes of streptococcal or non-streptococcal
origin designed to produce proteins having enhanced immunogenicity. Further,
the fusion proteins of the present invention may contain antigenic determinants
known to provide helper T-cell stimulation, peptides encoding sites for
post-translational modifications which enhance immunogenicity (e.g.,
acylation), peptides which facilitate purification (e.g., histidine "tag"), or amino
acid sequences which target the fusion protein to a desired location (e.g., a
heterologous leader sequence).

In all cases of bacterial expression, an N-terminal methionine residue is
added. In many cases, however, the N-terminal methionine residue is cleaved
off post-translationally. Thus, the invention includes polypeptides shown in
Table 1 with and without an N-terminal methionine.

The present invention thus includes nucleic acid molecules and
sequences which encode fusion proteins comprising one or more S.
pneumoniae polypeptides of the present invention fused to an amino acid
sequence which allows for post-translational modification to enhance
immunogenicity. This post-translational modification may occur either in vitro
or when the fusion protein is expressed in vivo in a host cell. An example of
such a modification is the introduction of an amino acid sequence which results
in the attachment of a lipid moiety.

Thus, as indicated above, the present invention includes genetic fusions
wherein a S. pneumoniae nucleic acid sequence identified in Table 1 is linked to
a nucleotide sequence encoding another amino acid sequence. These other
amino acid sequences may be of streptococcal origin (e.g., another sequence
selected from Table 1) or non-streptococcal origin.

The present invention further relates to variants of the nucleic acid
molecules of the present invention, which encode portions, analogs or
derivatives of the S. pneumoniae polypeptides described in Table 1. Variants
may occur naturally, such as a natural allelic variant. By an "allelic variant" is
intended one of several alternate forms of a gene occupying a given locus on a
chromosome of an organism (Genes II, Lewin, B., ed., John Wiley & Sons,
New York (1985)). Non-naturally occurring variants may be produced using
art-known mutagenesis techniques.

Such variants include those produced by nucleotide substitutions,
deletions or additions. The substitutions, deletions or additions may involve
one or more nucleotides. These variants may be altered in coding regions,
non-coding regions, or both. Alterations in the coding regions may produce
conservative or non-conservative amino acid substitutions, deletions or
additions. Especially preferred among these are silent substitutions, additions
and deletions, which do not alter the properties and activities of the S.
pneumoniae polypeptides disclosed herein or portions thereof. Silent
substitutions are most likely to be made in non-epitopic regions. Guidance
regarding those regions containing epitopes is provided herein, for example, in
Table 2. Also especially preferred in this regard are conservative substitutions.

Further embodiments of the invention include isolated nucleic acid
molecules comprising a polynucleotide having a nucleotide sequence at least
90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99%
identical to: (a) a nucleotide sequence encoding any of the amino acid sequences
of the polypeptides identified in Table 1; and (b) a nucleotide sequence
complementary to any of the nucleotide sequences in (a) above.

By a polynucleotide having a nucleotide sequence at least, for example,
95% "identical" to a reference nucleotide sequence encoding a S. pneumoniae
polypeptide described in Table 1, is intended that the nucleotide sequence of the
polynucleotide is identical to the reference sequence except that the
polynucleotide sequence may include up to five point mutations per each 100
nucleotides of the reference nucleotide sequence encoding the subject S.
pneumoniae polypeptide. In other words, to obtain a polynucleotide having a
nucleotide sequence at least 95% identical to a reference nucleotide sequence, up
to 5% of the nucleotides in the reference sequence may be deleted or substituted
with another nucleotide, or a number of nucleotides up to 5% of the total
nucleotides in the reference sequence may be inserted into the reference
sequence. These mutations of the reference sequence may occur at the 5' or 3'
terminal positions of the reference nucleotide sequence or anywhere between
those terminal positions, interspersed either individually among nucleotides in
the reference sequence or in one or more contiguous groups within the reference
sequence.
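
By way of illustration only, the arithmetic behind this definition can be sketched as follows. The function name `max_allowed_differences` is hypothetical; rounding the allowance down to a whole number of nucleotides is an assumption of the sketch, since the text speaks only of "up to" the stated percentage.

```python
def max_allowed_differences(reference_length: int, percent_identity: int) -> int:
    """Largest whole number of altered positions compatible with the stated identity.

    Integer arithmetic is used so that, e.g., 95% of 100 nt allows exactly 5 changes.
    """
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must lie between 0 and 100")
    return (reference_length * (100 - percent_identity)) // 100
```

Thus a 100 nt reference at 95% identity tolerates 5 point mutations, and a 1000 nt reference tolerates 50.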

Certain nucleotides within some of the nucleic acid sequences shown in
Table 1 were ambiguous upon sequencing. Completely unknown sequences are
shown as an "N". Other unresolved nucleotides are known to be either a
purine, shown as "R", or a pyrimidine, shown as "Y". Accordingly, when
determining identity between two nucleotide sequences, identity is met where
any nucleotide, including an "R", "Y" or "N", is found in a test sequence and at
the corresponding position in the reference sequence (from Table 1). Likewise,
an A, G or "R" in a test sequence is identical to an "R" in the reference
sequence; and a T, C or "Y" in a test sequence is identical to a "Y" in the
reference sequence.
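
By way of illustration only, the position-by-position identity rule for ambiguous symbols can be sketched as below. The names `REFERENCE_MATCHES`, `positions_identical` and `percent_identity_ungapped` are hypothetical. The sketch reads the rule as meaning that an "N" in the reference is matched by any symbol in the test sequence, and it treats an ambiguous symbol in the test sequence opposite an unambiguous reference base as non-identical; both readings are assumptions. Alignment gaps are not handled here.

```python
# For each reference symbol, the test-sequence symbols counted as identical to it.
REFERENCE_MATCHES = {
    "A": {"A"},
    "C": {"C"},
    "G": {"G"},
    "T": {"T"},
    "R": {"A", "G", "R"},          # purine
    "Y": {"T", "C", "Y"},          # pyrimidine
    "N": set("ACGTRYN"),           # completely unknown: assumed to match anything
}


def positions_identical(test_base: str, reference_base: str) -> bool:
    """Apply the ambiguity rule to one aligned pair of symbols."""
    return test_base.upper() in REFERENCE_MATCHES[reference_base.upper()]


def percent_identity_ungapped(test: str, reference: str) -> float:
    """Percent identity for two pre-aligned sequences of equal length (no gaps)."""
    if not reference or len(test) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(positions_identical(t, r) for t, r in zip(test, reference))
    return 100.0 * matches / len(reference)
```

For example, the test sequence `ACGT` compared with the reference `ARNT` matches at three of four positions, because C is not a purine.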

As a practical matter, whether any particular nucleic acid molecule is at
least 90%, 95%, 96%, 97%, 98% or 99% identical to, for instance, a nucleotide
sequence described in Table 1 can be determined conventionally using known
computer programs such as the Bestfit program (Wisconsin Sequence Analysis
Package, Version 8 for Unix, Genetics Computer Group, University Research
Park, 575 Science Drive, Madison, WI 53711). Bestfit uses the local
homology algorithm of Smith and Waterman (Advances in Applied Mathematics
2:482-489 (1981)), to find the best segment of homology between two
sequences. When using Bestfit or any other sequence alignment program to
determine whether a particular sequence is, for instance, 95% identical to a
reference sequence according to the present invention, the parameters are set, of
course, such that the percentage of identity is calculated over the full length of
the reference nucleotide sequence and that gaps in homology of up to 5% of the
total number of nucleotides in the reference sequence are allowed.
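
By way of illustration only, a simplified Smith-Waterman local alignment, with percent identity then calculated over the full length of the reference, can be sketched as follows. This is not the Bestfit program and does not reproduce its scoring: the match, mismatch and linear gap scores shown are arbitrary assumptions, and the function names are hypothetical.

```python
def smith_waterman_identities(test: str, reference: str,
                              match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Count identical aligned positions in the best-scoring local alignment."""
    rows, cols = len(test) + 1, len(reference) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    def pair_score(i: int, j: int) -> int:
        return match if test[i - 1] == reference[j - 1] else mismatch

    # Fill the dynamic-programming matrix; local alignment scores never drop below 0.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(0,
                              score[i - 1][j - 1] + pair_score(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score ends the local alignment.
    i, j = best_pos
    identical = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            if test[i - 1] == reference[j - 1]:
                identical += 1
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return identical


def percent_identity_over_reference(test: str, reference: str) -> float:
    """Identity expressed over the full length of the reference sequence."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return 100.0 * smith_waterman_identities(test, reference) / len(reference)
```

Dividing by the reference length, rather than by the length of the aligned segment, is what keeps a short but perfect local match from being reported as 100% identical to a much longer reference.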

The present application is directed to nucleic acid molecules at least
90%, 95%, 96%, 97%, 98% or 99% identical to a nucleic acid sequence
described in Table 1. One of skill in the art would still know how to use the
nucleic acid molecule, for instance, as a hybridization probe or a polymerase
chain reaction (PCR) primer. Uses of the nucleic acid molecules of the present
invention include, inter alia, (1) isolating Streptococcal genes or allelic variants
thereof from either a genomic or cDNA library and (2) Northern Blot or PCR
analysis for detecting Streptococcal mRNA expression.

Of course, due to the degeneracy of the genetic code, one of ordinary
skill in the art will immediately recognize that a large number of nucleic acid
molecules having a sequence at least 90%, 95%, 96%, 97%, 98%, or 99%
identical to a nucleic acid sequence identified in Table 1 will encode the same
polypeptide. In fact, since degenerate variants of these nucleotide sequences all
encode the same polypeptide, this will be clear to the skilled artisan even
without performing the above described comparison assay.

It will be further recognized in the art that, for such nucleic acid
molecules that are not degenerate variants, a reasonable number will also encode
proteins having antigenic epitopes of the S. pneumoniae polypeptides of the
present invention. This is because the skilled artisan is fully aware of amino
acid substitutions that are either less likely or not likely to significantly affect the
antigenicity of a polypeptide (e.g., replacement of an amino acid in a region
which is not believed to form an antigenic epitope). For example, since
antigenic epitopes have been identified which contain as few as six amino acids
(see Harlow, et al., Antibodies: A Laboratory Manual, 2nd Ed.; Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, New York (1988), page 76), in
instances where a polypeptide has multiple antigenic epitopes the alteration of
several amino acid residues would often not be expected to eliminate all of the
antigenic epitopes of that polypeptide. This is especially so when the alterations
are in regions believed to not constitute antigenic epitopes.

Vectors and Host Cells 

The present invention also relates to vectors which include the isolated 
DNA molecules of the present invention, host cells which are genetically 
engineered with the recombinant vectors, and the production of S. pneumoniae 
polypeptides or fragments thereof by recombinant techniques. 

Recombinant constructs may be introduced into host cells using well 
known techniques such as infection, transduction, transfection, transvection, 
electroporation and transformation. The vector may be, for example, a phage, 
plasmid, viral or retroviral vector. Retroviral vectors may be replication 
competent or replication defective. In the latter case, viral propagation generally 
will occur only in complementing host cells. 

The polynucleotides may be joined to a vector containing a selectable 
marker for propagation in a host. Generally, a plasmid vector is introduced in a 
precipitate, such as a calcium phosphate precipitate, or in a complex with a 
charged lipid. If the vector is a virus, it may be packaged in vitro using an 
appropriate packaging cell line and then transduced into host cells. 

Preferred are vectors comprising cis-acting control regions to the
polynucleotide of interest. Appropriate trans-acting factors may be supplied by
the host, supplied by a complementing vector or supplied by the vector itself
upon introduction into the host.

In certain preferred embodiments in this regard, the vectors provide for 
specific expression, which may be inducible and/or cell type-specific. 
Particularly preferred among such vectors are those inducible by environmental 
factors that are easy to manipulate, such as temperature and nutrient additives. 

Expression vectors useful in the present invention include
chromosomal-, episomal- and virus-derived vectors, e.g., vectors derived from
bacterial plasmids, bacteriophage, yeast episomes, yeast chromosomal
elements, viruses such as baculoviruses, papova viruses, vaccinia viruses,
adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and
vectors derived from combinations thereof, such as cosmids and phagemids.

The DNA insert should be operatively linked to an appropriate promoter,
such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters,
the SV40 early and late promoters and promoters of retroviral LTRs, to name a
few. Other suitable promoters will be known to the skilled artisan. The
expression constructs will further contain sites for transcription initiation,
termination and, in the transcribed region, a ribosome binding site for
translation. The coding portion of the mature transcripts expressed by the
constructs will preferably include a translation initiating site at the beginning and
a termination codon (UAA, UGA or UAG) appropriately positioned at the end
of the polypeptide to be translated.
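
By way of illustration only, a simple check that a coding region begins with an initiation codon and ends with one of the named termination codons can be sketched as follows. The function name `has_start_and_stop` is hypothetical, and the use of AUG as the initiation codon is an assumption of the sketch; the text above refers only to "a translation initiating site".

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}


def has_start_and_stop(coding_region: str, start_codon: str = "AUG") -> bool:
    """Check for an initiation codon first, a stop codon last, and no in-frame stop between."""
    # Accept DNA input by converting T to U before reading codons.
    mrna = coding_region.upper().replace("T", "U")
    if len(mrna) < 6 or len(mrna) % 3 != 0:
        return False
    codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]
    premature_stop = any(codon in STOP_CODONS for codon in codons[1:-1])
    return codons[0] == start_codon and codons[-1] in STOP_CODONS and not premature_stop
```

A construct whose reading frame contains an earlier in-frame stop codon is rejected, since translation would terminate before the end of the intended polypeptide.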

As indicated, the expression vectors will preferably include at least one
selectable marker. Such markers include dihydrofolate reductase or neomycin
resistance for eukaryotic cell culture and tetracycline or ampicillin resistance
genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast
cells; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells
such as CHO, COS and Bowes melanoma cells; and plant cells. Appropriate
culture media and conditions for the above-described host cells are known in
the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and
pQE-9, available from Qiagen; pBS vectors, Phagescript vectors, Bluescript
vectors, pNH8A, pNH16a, pNH18A, pNH46A available from Stratagene; pET
series of vectors available from Novagen; and ptrc99a, pKK223-3, pKK233-3,
pDR540, pRIT5 available from Pharmacia. Among preferred eukaryotic
vectors are pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available from
Stratagene; and pSVK3, pBPV, pMSG and pSVL available from Pharmacia.
Other suitable vectors will be readily apparent to the skilled artisan.

Known bacterial promoters suitable for use in the present
invention include the E. coli lacI and lacZ promoters, the T3 and T7 promoters,
the gpt promoter, the lambda PR and PL promoters and the trp promoter.
Suitable eukaryotic promoters include the CMV immediate early promoter, the
HSV thymidine kinase promoter, the early and late SV40 promoters, the
promoters of retroviral LTRs, such as those of the Rous sarcoma virus (RSV),
and metallothionein promoters, such as the mouse metallothionein-I promoter.
Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic
lipid-mediated transfection, electroporation, transduction, infection or other
methods. Such methods are described in many standard laboratory manuals
(for example, Davis, et al., Basic Methods In Molecular Biology (1986)).

Transcription of DNA encoding the polypeptides of the present
invention by higher eukaryotes may be increased by inserting an enhancer
sequence into the vector. Enhancers are cis-acting elements of DNA, usually
from about 10 to 300 bp, that act to increase transcriptional activity of a promoter
in a given host cell-type. Examples of enhancers include the SV40 enhancer,
which is located on the late side of the replication origin at bp 100 to 270, the
cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side
of the replication origin, and adenovirus enhancers.

For secretion of the translated polypeptide into the lumen of the
endoplasmic reticulum, into the periplasmic space or into the extracellular
environment, appropriate secretion signals may be incorporated into the
expressed polypeptide. The signals may be endogenous to the polypeptide or
they may be heterologous signals.

The polypeptide may be expressed in a modified form, such as a fusion
protein, and may include not only secretion signals, but also additional
heterologous functional regions. For instance, a region of additional amino
acids, particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence in the host cell, during
purification, or during subsequent handling and storage. Also, peptide moieties
may be added to the polypeptide to facilitate purification. Such regions may be
removed prior to final preparation of the polypeptide. The addition of peptide
moieties to polypeptides to engender secretion or excretion, to improve stability
and to facilitate purification, among others, are familiar and routine techniques
in the art. A preferred fusion protein comprises a heterologous region from
immunoglobulin that is useful to solubilize proteins. For example, EP-A-0 464
533 (Canadian counterpart 2045869) discloses fusion proteins comprising
various portions of constant region of immunoglobulin molecules together with
another human protein or part thereof. In many cases, the Fc part in a fusion
protein is thoroughly advantageous for use in therapy and diagnosis and thus
results, for example, in improved pharmacokinetic properties (EP-A 0232 262).
On the other hand, for some uses it would be desirable to be able to delete the
Fc part after the fusion protein has been expressed, detected and purified in the
advantageous manner described. This is the case when the Fc portion proves to be
a hindrance to use in therapy and diagnosis, for example when the fusion
protein is to be used as antigen for immunizations. In drug discovery, for
example, human proteins, such as the hIL-5 receptor, have been fused with Fc
portions for the purpose of high-throughput screening assays to identify
antagonists of hIL-5. See Bennett, D., et al., J. Molec. Recogn. 8:52-58 (1995)
and Johanson, K., et al., J. Biol. Chem. 270(16):9459-9471 (1995).

The S. pneumoniae polypeptides can be recovered and purified from
recombinant cell cultures by well-known methods including ammonium sulfate
or ethanol precipitation, acid extraction, anion or cation exchange
chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography,
lectin chromatography and high performance liquid chromatography ("HPLC").
Polypeptides of the present invention include
naturally purified products, products of chemical synthetic procedures, and
products produced by recombinant techniques from a prokaryotic or eukaryotic
host, including, for example, bacterial, yeast, higher plant, insect and
mammalian cells.

Polypeptides and Fragments 

The invention further provides isolated polypeptides having the amino 
acid sequences described in Table 1, and shown as SEQ ID NO:2, SEQ ID 
NO:4, SEQ ID NO:6, and so on through SEQ ID NO:226, and peptides or 
polypeptides comprising portions of the above polypeptides. The terms 
"peptide" and "oligopeptide" are considered synonymous (as is commonly 
recognized) and each term can be used interchangeably as the context requires to 
indicate a chain of at least two amino acids coupled by peptidyl linkages. The 
word "polypeptide" is used herein for chains containing more than ten amino 
acid residues. All oligopeptide and polypeptide formulas or sequences herein 
are written from left to right and in the direction from amino terminus to carboxy 
terminus. 

Some amino acid sequences of the S. pneumoniae polypeptides
described in Table 1 can be varied without significantly affecting the antigenicity
of the polypeptides. If such differences in sequence are contemplated, it should
be remembered that there will be critical areas on the polypeptide which
determine antigenicity. In general, it is possible to replace residues which do
not form part of an antigenic epitope without significantly affecting the
antigenicity of a polypeptide. Guidance for such alterations is given in Table 2
wherein epitopes for each polypeptide are delineated.

The polypeptides of the present invention are preferably provided in an 
isolated form. By "isolated polypeptide" is intended a polypeptide removed 
from its native environment. Thus, a polypeptide produced and/or contained 
within a recombinant host cell is considered isolated for purposes of the present 
invention. Also intended as an "isolated polypeptide" is a polypeptide that has 
been purified, partially or substantially, from a recombinant host cell. For 
example, recombinantly produced versions of the S. pneumoniae polypeptides
described in Table 1 can be substantially purified by the one-step method
described by Smith and Johnson (Gene 67:31-40 (1988)).

The polypeptides of the present invention include: (a) an amino acid 
sequence of any of the polypeptides described in Table 1; and (b) an amino acid 
sequence of an epitope-bearing portion of any one of the polypeptides of (a); as 
well as polypeptides with at least 70% similarity, and more preferably at least 
75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% similarity to those 
described in (a) or (b) above, as well as polypeptides having an amino acid 
sequence at least 70% identical, more preferably at least 75% identical, and still 
more preferably 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to
those above. 

By "% similarity" for two polypeptides is intended a similarity score 
produced by comparing the amino acid sequences of the two polypeptides using 
the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for
Unix, Genetics Computer Group, University Research Park, 575 Science
Drive, Madison, WI 53711) and the default settings for determining similarity.
Bestfit uses the local homology algorithm of Smith and Waterman (Advances in
Applied Mathematics 2:482-489 (1981)) to find the best segment of similarity
between two sequences.

By a polypeptide having an amino acid sequence at least, for example, 
95% "identical" to a reference amino acid sequence of a S. pneumoniae 
polypeptide is intended that the amino acid sequence of the polypeptide is 
identical to the reference sequence except that the polypeptide sequence may 
include up to five amino acid alterations per each 100 amino acids of the 
reference amino acid sequence. In other words, to obtain a polypeptide having 
an amino acid sequence at least 95% identical to a reference amino acid 
sequence, up to 5% of the amino acid residues in the reference sequence may be 
deleted or substituted with another amino acid, or a number of amino acids up to 
5% of the total amino acid residues in the reference sequence may be inserted
into the reference sequence. These alterations of the reference sequence may 
occur at the amino or carboxy terminal positions of the reference amino acid 
sequence or anywhere between those terminal positions, interspersed either 
individually among residues in the reference sequence or in one or more 
contiguous groups within the reference sequence. 

The amino acid sequences shown in Table 1 may have one or more "X"
residues. "X" represents unknown. Thus, for purposes of defining identity, if
an X is shown at a position in a reference amino acid sequence (shown in
Table 1), any amino acid present at the corresponding position in the compared
sequence is treated as identical at that position.
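
By way of illustration only, the treatment of "X" residues when scoring identity can be sketched as below. The function names are hypothetical, the example sequences are placeholders rather than sequences from Table 1, and the sketch assumes two pre-aligned sequences of equal length with no gaps.

```python
def identical_at_position(test_residue: str, reference_residue: str) -> bool:
    """An X in the reference is treated as identical to any residue in the test sequence."""
    return reference_residue == "X" or test_residue == reference_residue


def percent_identity(test: str, reference: str) -> float:
    """Percent identity over the full reference for equal-length, ungapped sequences."""
    if not reference or len(test) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        identical_at_position(t, r) for t, r in zip(test.upper(), reference.upper())
    )
    return 100.0 * matches / len(reference)
```

For example, the test sequence `MKTAY` against the reference `MXTAV` is identical at four of five positions.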

As a practical matter, whether any particular polypeptide is at least 70%, 
75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to, for 
instance, an amino acid sequence shown in Table 1, can be determined 
conventionally using known computer programs such as the Bestfit program
(Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics 
Computer Group, University Research Park, 575 Science Drive, Madison, WI 
53711). When using Bestfit or any other sequence alignment program to 
determine whether a particular sequence is, for instance, 95% identical to a 
reference sequence according to the present invention, the parameters are set, of 
course, such that the percentage of identity is calculated over the full length of 
the reference amino acid sequence and that gaps in homology of up to 5% of the 
total number of amino acid residues in the reference sequence are allowed. 

As described below, the polypeptides of the present invention can also
be used to raise polyclonal and monoclonal antibodies, which are useful in 
assays for detecting Streptococcal protein expression. 

In another aspect, the invention provides peptides and polypeptides 
comprising epitope-bearing portions of the S. pneumoniae polypeptides of the 
invention. These epitopes are immunogenic or antigenic epitopes of the 
polypeptides of the invention. An "immunogenic epitope" is defined as a part of 
a protein that elicits an antibody response when the whole protein or polypeptide 
is the immunogen. These immunogenic epitopes are believed to be confined to 
a few loci on the molecule. On the other hand, a region of a protein molecule to 
which an antibody can bind is defined as an "antigenic determinant" or 
"antigenic epitope." The number of immunogenic epitopes of a protein 
generally is less than the number of antigenic epitopes (Geysen, et al., Proc.
Natl. Acad. Sci. USA 81:3998-4002 (1983)). Predicted antigenic epitopes are
shown in Table 2, below. 

As to the selection of peptides or polypeptides bearing an antigenic
epitope (i.e., that contain a region of a protein molecule to which an antibody
can bind), it is well known in the art that relatively short synthetic peptides that
mimic part of a protein sequence are routinely capable of eliciting an antiserum
that reacts with the partially mimicked protein (for instance, Sutcliffe, J., et al.,
Science 219:660-666 (1983)). Peptides capable of eliciting protein-reactive
sera are frequently represented in the primary sequence of a protein, can be
characterized by a set of simple chemical rules, and are confined neither to
immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to
the amino or carboxyl terminals. Peptides that are extremely hydrophobic and
those of six or fewer residues generally are ineffective at inducing antibodies
that bind to the mimicked protein; longer peptides, especially those containing
proline residues, usually are effective (Sutcliffe, et al., supra, p. 661). For
instance, 18 of 20 peptides designed according to these guidelines, containing
8-39 residues covering 75% of the sequence of the influenza virus
hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with the
HA1 protein or intact virus; and 12/12 peptides from the MuLV polymerase and
18/18 from the rabies glycoprotein induced antibodies that precipitated the
respective proteins.

Antigenic epitope-bearing peptides and polypeptides of the invention are 
therefore useful to raise antibodies, including monoclonal antibodies, that bind 
specifically to a polypeptide of the invention. Thus, a high proportion of 
hybridomas obtained by fusion of spleen cells from donors immunized with an 
antigenic epitope-bearing peptide generally secrete antibody reactive with the 
native protein (Sutcliffe, et al., supra, p. 663). The antibodies raised by 
antigenic epitope-bearing peptides or polypeptides are useful to detect the 
mimicked protein, and antibodies to different peptides may be used for tracking 
the fate of various regions of a protein precursor which undergoes 
post-translational processing. The peptides and anti-peptide antibodies may be 
used in a variety of qualitative or quantitative assays for the mimicked protein, 
for instance in competition assays since it has been shown that even short 
peptides (e.g., about 9 amino acids) can bind and displace the larger peptides in 
immunoprecipitation assays (for instance, Wilson, et al., Cell 37:767-778 
(1984) p. 777). The anti-peptide antibodies of the invention also are useful for 
purification of the mimicked protein, for instance, by adsorption 
chromatography using methods well known in the art. 

Antigenic epitope-bearing peptides and polypeptides of the invention 
designed according to the above guidelines preferably contain a sequence of at 
least seven, more preferably at least nine and most preferably between about 15 
and about 30 amino acids contained within the amino acid sequence of a 
polypeptide of the invention. However, peptides or polypeptides comprising a 
larger portion of an amino acid sequence of a polypeptide of the invention, 
containing about 30 to about 50 amino acids, or any length up to and including 
the entire amino acid sequence of a polypeptide of the invention, also are 
considered epitope-bearing peptides or polypeptides of the invention and also 
are useful for inducing antibodies that react with the mimicked protein. 
Preferably, the amino acid sequence of the epitope-bearing peptide is selected to 
provide substantial solubility in aqueous solvents (i.e., the sequence includes 
relatively hydrophilic residues and highly hydrophobic sequences are preferably 
avoided); and sequences containing proline residues are particularly preferred. 
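
The selection guidelines above (a minimum length, avoidance of highly 
hydrophobic sequences, and a preference for proline-containing sequences) can 
be illustrated with a short screening sketch. This is only an illustration under 
stated assumptions: the text names no hydropathy scale, so the Kyte-Doolittle 
values, the default window of 15 residues, the cutoff of 0.0, and the function 
names are all hypothetical choices, not part of the disclosed method. 

```python
# Kyte-Doolittle hydropathy values (an assumed scale; the text specifies none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average hydropathy of a peptide; lower values are more hydrophilic."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def candidate_peptides(sequence, length=15, max_hydropathy=0.0):
    """Yield (start, peptide, has_proline) for each window that passes the screen.

    start is 1-based. The length default and the hydropathy cutoff are
    illustrative assumptions, not values taken from the text.
    """
    if length < 7:
        raise ValueError("the text prefers peptides of at least seven residues")
    for i in range(len(sequence) - length + 1):
        peptide = sequence[i:i + length]
        if mean_hydropathy(peptide) <= max_hydropathy:
            yield i + 1, peptide, "P" in peptide


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the functions.
    demo = "MKDEPSKTEPSQDNLLLLVVVIIAAKDEPSRT"
    for start, peptide, has_proline in candidate_peptides(demo):
        print(start, peptide, "proline" if has_proline else "no proline")
```

Windows flagged as containing proline would be ranked first under the stated 
preference; the screen does not predict antigenicity and is not a substitute for 
the fragments identified in Table 2. 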

Non-limiting examples of antigenic polypeptides or peptides that can be 
used to generate Streptococcal-specific antibodies include portions of the amino 
acid sequences identified in Table 1. More specifically, Table 2 discloses 
antigenic fragments of polypeptides of the present invention, which antigenic 
fragments comprise amino acid sequences from about the first amino acid 
residue indicated to about the last amino acid residue indicated for each 
fragment. The polypeptide fragments disclosed in Table 2 are believed to be 
antigenic regions of the S. pneumoniae polypeptides described in Table 1. Thus, 
the invention further includes isolated peptides and polypeptides comprising an 
amino acid sequence of an epitope shown in Table 2 and polynucleotides 
encoding said polypeptides. 

The epitope-bearing peptides and polypeptides of the invention may be 
produced by any conventional means for making peptides or polypeptides 
including recombinant means using nucleic acid molecules of the invention. For 
instance, an epitope-bearing amino acid sequence of the present invention may 
be fused to a larger polypeptide which acts as a carrier during recombinant 
production and purification, as well as during immunization to produce 
anti-peptide antibodies. Epitope-bearing peptides also may be synthesized using 
known methods of chemical synthesis. For instance, Houghten has described a 
simple method for synthesis of large numbers of peptides, such as 10-20 mg of 
248 different 13 residue peptides representing single amino acid variants of a 
segment of the HA1 polypeptide which were prepared and characterized (by 
ELISA-type binding studies) in less than four weeks (Houghten, R. A. Proc. 
Natl. Acad. Sci. USA 82:5131-5135 (1985)). This "Simultaneous Multiple 
Peptide Synthesis (SMPS)" process is further described in U.S. Patent No. 
4,631,211 to Houghten and coworkers (1986). In this procedure the individual 
resins for the solid-phase synthesis of various peptides are contained in separate 
solvent-permeable packets, enabling the optimal use of the many identical 
repetitive steps involved in solid-phase methods. A completely manual 
procedure allows 500-1000 or more syntheses to be conducted simultaneously 
(Houghten, et al., supra, p. 5134). 

Epitope-bearing peptides and polypeptides of the invention are used to 
induce antibodies according to methods well known in the art (for instance, 
Sutcliffe, et al., supra; Wilson, et al., supra; Chow, M., et al., Proc. Natl. 
Acad. Sci. USA 82:910-914; and Bittle, F. J., et al., J. Gen. Virol. 
66:2347-2354 (1985)). Generally, animals may be immunized with free 
peptide; however, anti-peptide antibody titer may be boosted by coupling of the 
peptide to a macromolecular carrier, such as keyhole limpet hemocyanin (KLH) 
or tetanus toxoid. For instance, peptides containing cysteine may be coupled to 
carrier using a linker such as m-maleimidobenzoyl-N-hydroxysuccinimide ester 
(MBS), while other peptides may be coupled to carrier using a more general 
linking agent such as glutaraldehyde. Animals such as rabbits, rats and mice are 
immunized with either free or carrier-coupled peptides, for instance, by 
intraperitoneal and/or intradermal injection of emulsions containing about 100 
µg peptide or carrier protein and Freund's adjuvant. Several booster injections 
may be needed, for instance, at intervals of about two weeks, to provide a 
useful titer of anti-peptide antibody which can be detected, for example, by 
ELISA assay using free peptide adsorbed to a solid surface. The titer of 
anti-peptide antibodies in serum from an immunized animal may be increased by 
selection of anti-peptide antibodies, for instance, by adsorption to the peptide on 
a solid support and elution of the selected antibodies according to methods well 
known in the art. 

Immunogenic epitope-bearing peptides of the invention, i.e., those parts 
of a protein that elicit an antibody response when the whole protein is the 
immunogen, are identified according to methods known in the art. For 
instance, Geysen, et al., supra, discloses a procedure for rapid concurrent 
synthesis on solid supports of hundreds of peptides of sufficient purity to react 
in an enzyme-linked immunosorbent assay. Interaction of synthesized peptides 
with antibodies is then easily detected without removing them from the support. 
In this manner a peptide bearing an immunogenic epitope of a desired protein 
may be identified routinely by one of ordinary skill in the art. For instance, the 
immunologically important epitope in the coat protein of foot-and-mouth disease 
virus was located by Geysen et al., supra, with a resolution of seven amino acids 
by synthesis of an overlapping set of all 208 possible hexapeptides covering the 
entire 213 amino acid sequence of the protein. Then, a complete replacement set 
of peptides in which all 20 amino acids were substituted in turn at every position 
within the epitope was synthesized, and the particular amino acids conferring 
specificity for the reaction with antibody were determined. Thus, peptide 
analogs of the epitope-bearing peptides of the invention can be made routinely 
by this method. U.S. Patent No. 4,708,781 to Geysen (1987) further describes 
this method of identifying a peptide bearing an immunogenic epitope of a 
desired protein. 
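
The arithmetic of the overlapping-peptide scan described above can be checked 
with a brief sketch: a protein of 213 residues contains 213 - 6 + 1 = 208 
distinct hexapeptide windows, and a replacement set for a six-residue epitope 
contains 6 x 19 = 114 single-substitution variants in addition to the parent 
sequence. The function names and the placeholder sequences below are 
hypothetical and are used only for illustration; they are not the sequences 
studied by Geysen. 

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def overlapping_peptides(sequence, length=6):
    """Return every window of `length` residues, stepping one residue at a time."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def replacement_set(epitope):
    """Return all single-residue variants of `epitope`.

    Each position is substituted in turn with the 19 residues other than the
    one already present; the unmodified parent peptide is not included.
    """
    variants = []
    for pos, original in enumerate(epitope):
        for aa in AMINO_ACIDS:
            if aa != original:
                variants.append(epitope[:pos] + aa + epitope[pos + 1:])
    return variants


if __name__ == "__main__":
    placeholder_protein = "A" * 213      # stands in for a 213-residue protein
    print(len(overlapping_peptides(placeholder_protein)))
    print(len(replacement_set("ACDEFG")))  # placeholder six-residue epitope
```

Adding the parent peptide to the 114 variants gives the full set in which all 20 
residues have been presented at every position. 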

Further still, U.S. Patent No. 5,194,392, to Geysen (1990), describes a 
general method of detecting or determining the sequence of monomers (amino 
acids or other compounds) which is a topological equivalent of the epitope (i.e., 
a "mimotope") which is complementary to a particular paratope (antigen binding 
site) of an antibody of interest. More generally, U.S. Patent No. 4,433,092, 
also to Geysen (1989), describes a method of detecting or determining a 
sequence of monomers which is a topographical equivalent of a ligand which is 
complementary to the ligand binding site of a particular receptor of interest. 
Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) 
discloses linear C -C -alkyl peralkylated oligopeptides and sets and libraries of 
such peptides, as well as methods for using such oligopeptide sets and libraries 
for determining the sequence of a peralkylated oligopeptide that preferentially 
binds to an acceptor molecule of interest. Thus, non-peptide analogs of the 
epitope-bearing peptides of the invention also can be made routinely by these 
methods. 

The entire disclosure of each document cited in this section on 
"Polypeptides and Fragments" is hereby incorporated herein by reference. 

As one of skill in the art will appreciate, the polypeptides of the present 
invention and the epitope-bearing fragments thereof described above can be 
combined with parts of the constant domain of immunoglobulins (IgG), 
resulting in chimeric polypeptides. These fusion proteins facilitate purification 
and show an increased half-life in vivo. This has been shown, e.g., for 
chimeric proteins consisting of the first two domains of the human 
CD4-polypeptide and various domains of the constant regions of the heavy or 
light chains of mammalian immunoglobulins (EPA 0,394,827; Traunecker et 
al., Nature 331:84-86 (1988)). Fusion proteins that have a disulfide-linked 
dimeric structure due to the IgG part can also be more efficient in binding and 
neutralizing other molecules than a monomeric S. pneumoniae polypeptide or 
fragment thereof alone (Fountoulakis et al., J. Biochem. 270:3958-3964 
(1995)). 

Diagnostic Assays 

The present invention further relates to a method for assaying for 
Streptococcal infection in an animal via detecting the expression of genes 
encoding Streptococcal polypeptides (e.g., the polypeptides described in Table 1). 
This method comprises analyzing tissue or body fluid from the animal for 
Streptococcus-specific antibodies or Streptococcal nucleic acids or proteins. 
Analysis of nucleic acid specific to Streptococcus can be done by PCR or 
hybridization techniques using nucleic acid sequences of the present invention 
as either hybridization probes or primers (cf. Molecular Cloning: A Laboratory 
Manual, second edition, edited by Sambrook, Fritsch, & Maniatis, Cold Spring 
Harbor Laboratory, 1989; Eremeeva et al., J. Clin. Microbiol. 32:803-810 
(1994), which describes differentiation among spotted fever group Rickettsiae 
species by analysis of restriction fragment length polymorphism of 
PCR-amplified DNA). Methods for detecting B. burgdorferi nucleic acids via 
PCR are described, for example, in Chen et al., J. Clin. Microbiol. 32:589-595 
(1994). 

Where diagnosis of a disease state related to infection with 
Streptococcus has already been made, the present invention is useful for 
monitoring progression or regression of the disease state whereby patients 
exhibiting enhanced Streptococcus gene expression will experience a worse 
clinical outcome relative to patients expressing these gene(s) at a lower level. 

By "assaying for Streptococcal infection in an animal via detection of 
genes encoding Streptococcal polypeptides" is intended qualitatively or 
quantitatively measuring or estimating the level of one or more Streptococcus 
polypeptides or the level of nucleic acid encoding Streptococcus polypeptides in 
a first biological sample either directly (e.g., by determining or estimating 
absolute protein level or nucleic acid level) or relatively (e.g., by comparing to the 
Streptococcus polypeptide level or mRNA level in a second biological sample). 
The Streptococcus polypeptide level or nucleic acid level in the second sample 
used for a relative comparison may be undetectable if obtained from an animal 
which is not infected with Streptococcus. When monitoring the progression or 
regression of a disease state, the Streptococcus polypeptide level or nucleic acid 
level may be compared to a second sample obtained from either an animal 
infected with Streptococcus or the same animal from which the first sample was 
obtained but taken from that animal at a different time than the first. As will be 
appreciated in the art, once a standard Streptococcus polypeptide level or nucleic 
acid level which corresponds to a particular stage of a Streptococcus infection is 
known, it can be used repeatedly as a standard for comparison. 

By "biological sample" is intended any biological sample obtained from 
an animal, cell line, tissue culture, or other source which contains Streptococcus 
polypeptide, mRNA, or DNA. Biological samples include body fluids (such as 
plasma and synovial fluid) which contain Streptococcus polypeptides, and 
muscle, skin, and cartilage tissues. Methods for obtaining tissue biopsies and 
body fluids are well known in the art. 

The present invention is useful for detecting diseases related to 
Streptococcus infections in animals. Preferred animals include monkeys, apes, 
cats, dogs, cows, pigs, mice, horses, rabbits and humans. Particularly 
preferred are humans. 

Total RNA can be isolated from a biological sample using any suitable 
technique such as the single-step guanidinium-thiocyanate-phenol-chloroform 
method described in Chomczynski and Sacchi, Anal. Biochem. 162:156-159 
(1987). mRNA encoding Streptococcus polypeptides having sufficient 
homology to the nucleic acid sequences identified in Table 1 to allow for 
hybridization between complementary sequences is then assayed using any 
appropriate method. These include Northern blot analysis, S1 nuclease 
mapping, the polymerase chain reaction (PCR), reverse transcription in 
combination with the polymerase chain reaction (RT-PCR), and reverse 
transcription in combination with the ligase chain reaction (RT-LCR). 

Northern blot analysis can be performed as described in Harada et al., 
Cell 63:303-312 (1990). Briefly, total RNA is prepared from a biological 
sample as described above. For the Northern blot, the RNA is denatured in an 
appropriate buffer (such as glyoxal/dimethyl sulfoxide/sodium phosphate 
buffer), subjected to agarose gel electrophoresis, and transferred onto a 
nitrocellulose filter. After the RNAs have been linked to the filter by a UV 
linker, the filter is prehybridized in a solution containing formamide, SSC, 
Denhardt's solution, denatured salmon sperm DNA, SDS, and sodium phosphate 
buffer. An S. pneumoniae polypeptide DNA sequence shown in Table 1, labeled 
according to any appropriate method (such as the ³²P-multiprimed DNA labeling 
system (Amersham)), is used as probe. After hybridization overnight, the filter 
is washed and exposed to x-ray film. DNA for use as probe according to the 
present invention is described in the sections above and will preferably be at 
least 15 bp in length. 

S1 mapping can be performed as described in Fujita et al., Cell 
49:357-367 (1987). To prepare probe DNA for use in S1 mapping, the sense 
strand of an above-described S. pneumoniae DNA sequence of the present 
invention is used as a template to synthesize labeled antisense DNA. The 
antisense DNA can then be digested using an appropriate restriction 
endonuclease to generate further DNA probes of a desired length. Such 
antisense probes are useful for visualizing protected bands corresponding to the 
target mRNA (i.e., mRNA encoding Streptococcus polypeptides). 
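
The relationship between a sense strand and the antisense probe synthesized 
from it is simply that of reverse complementarity, which a short sketch can 
make concrete. The function names and the example sequence are hypothetical; 
the 15-base minimum reflects the preferred probe length stated above, applied 
here as an assumption to single-stranded probes as well. 

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense_strand):
    """Return the reverse complement of a DNA sense strand, written 5' to 3'."""
    return sense_strand.translate(_COMPLEMENT)[::-1]


def is_usable_probe(probe, min_length=15):
    """Check a probe against the preferred minimum length (assumed 15 bases)."""
    return len(probe) >= min_length


if __name__ == "__main__":
    sense = "ATGAAAGATGAACCGAGC"  # hypothetical 18-base fragment
    probe = antisense(sense)
    print(probe, is_usable_probe(probe))
```

Applying `antisense` twice returns the original strand, which is a convenient 
sanity check when preparing probe sequences by hand. 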

Preferably, levels of mRNA encoding Streptococcus polypeptides are 
assayed using the RT-PCR method described in Makino et al., 
Technique 2:295-301 (1990). By this method, the radioactivities of the 
"amplicons" in the polyacrylamide gel bands are linearly related to the initial 
concentration of the target mRNA. Briefly, this method involves adding total 
RNA isolated from a biological sample in a reaction mixture containing a RT 
primer and appropriate buffer. After incubating for primer annealing, the 
mixture can be supplemented with a RT buffer, dNTPs, DTT, RNase inhibitor 
and reverse transcriptase. After incubation to achieve reverse transcription of 
the RNA, the RT products are then subject to PCR using labeled primers. 
Alternatively, rather than labeling the primers, a labeled dNTP can be included 
in the PCR reaction mixture. PCR amplification can be performed in a DNA 
thermal cycler according to conventional techniques. After a suitable number of 
rounds to achieve amplification, the PCR reaction mixture is electrophoresed on 
a polyacrylamide gel. After drying the gel, the radioactivity of the appropriate 
bands (corresponding to the mRNA encoding the Streptococcus polypeptides) 
is quantified using an imaging analyzer. RT and PCR reaction ingredients and 
conditions, reagent and gel concentrations, and labeling methods are well 
known in the art. Variations on the RT-PCR method will be apparent to the 
skilled artisan. 

Assaying Streptococcus polypeptide levels in a biological sample can 
occur using any art-known method. Preferred for assaying Streptococcus 
polypeptide levels in a biological sample are antibody-based techniques. For 
example, Streptococcus polypeptide expression in tissues can be studied with 
classical immunohistological methods. In these, the specific recognition is 
provided by the primary antibody (polyclonal or monoclonal) but the secondary 
detection system can utilize fluorescent, enzyme, or other conjugated secondary 
antibodies. As a result, an immunohistological staining of a tissue section for 
pathological examination is obtained. Tissues can also be extracted, e.g., with 
urea and neutral detergent, for the liberation of Streptococcus polypeptides for 
Western-blot or dot/slot assay (Jalkanen, M., et al., J. Cell Biol. 101:976-985 
(1985); Jalkanen, M., et al., J. Cell Biol. 105:3087-3096 (1987)). In this 
technique, which is based on the use of cationic solid phases, quantitation of a 
Streptococcus polypeptide can be accomplished using an isolated Streptococcus 
polypeptide as a standard. This technique can also be applied to body fluids. 

Other antibody-based methods useful for detecting Streptococcus 
polypeptide gene expression include immunoassays, such as the enzyme linked 
immunosorbent assay (ELISA) and the radioimmunoassay (RIA). For 
example, Streptococcus polypeptide-specific monoclonal antibodies can be 
used both as an immunoabsorbent and as an enzyme-labeled probe to detect and 
quantify a Streptococcus polypeptide. The amount of a Streptococcus 
polypeptide present in the sample can be calculated by reference to the amount 
present in a standard preparation using a linear regression computer algorithm. 
Such an ELISA for detecting a tumor antigen is described in Iacobelli et al., 
Breast Cancer Research and Treatment 11:19-30 (1988). In another ELISA 
assay, two distinct specific monoclonal antibodies can be used to detect 
Streptococcus polypeptides in a body fluid. In this assay, one of the antibodies 
is used as the immunoabsorbent and the other as the enzyme-labeled probe. 
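
The calculation "by reference to the amount present in a standard preparation 
using a linear regression computer algorithm" can be sketched as an ordinary 
least-squares fit of signal against known standard amounts, followed by 
inversion of the fitted line for an unknown sample. The sketch assumes the assay 
is being read within its linear range; the function names, units, and all 
numerical values are hypothetical and are not data from any assay described 
herein. 

```python
def fit_line(amounts, signals):
    """Ordinary least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired standard points")
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("standard amounts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate the amount in an unknown sample."""
    return (signal - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standard preparation: amounts (ng/well) and absorbances.
    standard_amounts = [0.0, 1.0, 2.0, 4.0]
    standard_signals = [0.1, 0.3, 0.5, 0.9]
    slope, intercept = fit_line(standard_amounts, standard_signals)
    print(amount_from_signal(0.7, slope, intercept))
```

Estimates for signals outside the range spanned by the standards are 
extrapolations and should be treated with corresponding caution. 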

The above techniques may be conducted essentially as a "one-step" or 
"two-step" assay. The "one-step" assay involves contacting the Streptococcus 
polypeptide with immobilized antibody and, without washing, contacting the 
mixture with the labeled antibody. The "two-step" assay involves washing 
before contacting the mixture with the labeled antibody. Other conventional 
methods may also be employed as suitable. It is usually desirable to immobilize 
one component of the assay system on a support, thereby allowing other 
components of the system to be brought into contact with the component and 
readily removed from the sample. 

Streptococcus polypeptide-specific antibodies for use in the present 
invention can be raised against an intact S. pneumoniae polypeptide of the present 
invention or fragment thereof. These polypeptides and fragments may be 
administered to an animal (e.g., rabbit or mouse) either with a carrier protein 
(e.g., albumin) or, if long enough (e.g., at least about 25 amino acids), without 
a carrier. 

As used herein, the term "antibody" (Ab) or "monoclonal antibody" 
(Mab) is meant to include intact molecules as well as antibody fragments (such 
as, for example, Fab and F(ab')2 fragments) which are capable of specifically 
binding to a Streptococcus polypeptide. Fab and F(ab')2 fragments lack the Fc 
fragment of intact antibody, clear more rapidly from the circulation, and may 
have less non-specific tissue binding than an intact antibody (Wahl et al., J. Nucl. 
Med. 24:316-325 (1983)). Thus, these fragments are preferred. 

The antibodies of the present invention may be prepared by any of a 
variety of methods. For example, the S. pneumoniae polypeptides identified in 
Table 1, or fragments thereof, can be administered to an animal in order to 
induce the production of sera containing polyclonal antibodies. In a preferred 
method, a preparation of a S. pneumoniae polypeptide of the present invention 
is prepared and purified to render it substantially free of natural contaminants. 
Such a preparation is then introduced into an animal in order to produce 
polyclonal antisera of high specific activity. 

In the most preferred method, the antibodies of the present invention are 
monoclonal antibodies. Such monoclonal antibodies can be prepared using 
hybridoma technology (Kohler et al., Nature 256:495 (1975); Kohler et al., 
Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J. Immunol. 6:292 (1976); 
Hammerling et al., In: Monoclonal Antibodies and T-Cell Hybridomas, 
Elsevier, N.Y., (1981) pp. 563-681). In general, such procedures involve 
immunizing an animal (preferably a mouse) with a S. pneumoniae polypeptide 
antigen of the present invention. Suitable cells can be recognized by their 
capacity to bind anti-Streptococcus polypeptide antibody. Such cells may be 
cultured in any suitable tissue culture medium; however, it is preferable to 
culture cells in Earle's modified Eagle's medium supplemented with 10% fetal 
bovine serum (inactivated at about 56°C), and supplemented with about 10 g/l 
of nonessential amino acids, about 1,000 U/ml of penicillin, and about 100 
µg/ml of streptomycin. The splenocytes of such mice are extracted and fused 
with a suitable myeloma cell line. Any suitable myeloma cell line may be 
employed in accordance with the present invention; however, it is preferable to 
employ the parent myeloma cell line (SP2O), available from the American Type 
Culture Collection, Rockville, Maryland. After fusion, the resulting hybridoma 
cells are selectively maintained in HAT medium, and then cloned by limiting 
dilution as described by Wands et al. (Gastroenterology 80:225-232 (1981)). 
The hybridoma cells obtained through such a selection are then assayed to 
identify clones which secrete antibodies capable of binding the Streptococcus 
polypeptide antigen administered to the immunized animal. 

Alternatively, additional antibodies capable of binding to Streptococcus 
polypeptide antigens may be produced in a two-step procedure through the use 
of anti-idiotypic antibodies. Such a method makes use of the fact that antibodies 
are themselves antigens, and that, therefore, it is possible to obtain an antibody 
which binds to a second antibody. In accordance with this method, 
Streptococcus polypeptide-specific antibodies are used to immunize an animal, 
preferably a mouse. The splenocytes of such an animal are then used to 
produce hybridoma cells, and the hybridoma cells are screened to identify 
clones which produce an antibody whose ability to bind to the Streptococcus 
polypeptide-specific antibody can be blocked by a Streptococcus polypeptide 
antigen. Such antibodies comprise anti-idiotypic antibodies to the Streptococcus 
polypeptide-specific antibody and can be used to immunize an animal to induce 
formation of further Streptococcus polypeptide-specific antibodies. 

It will be appreciated that Fab and F(ab')2 and other fragments of the 
antibodies of the present invention may be used according to the methods 
disclosed herein. Such fragments are typically produced by proteolytic 
cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin 
(to produce F(ab')2 fragments). Alternatively, Streptococcus 
polypeptide-binding fragments can be produced through the application of 
recombinant DNA technology or through synthetic chemistry. 

Of special interest to the present invention are antibodies to 
Streptococcus polypeptide antigens which are produced in humans, or are 
"humanized" (i.e., non-immunogenic in a human) by recombinant or other 
technology. Humanized antibodies may be produced, for example, by replacing 
an immunogenic portion of an antibody with a corresponding, but 
non-immunogenic portion (i.e., chimeric antibodies) (Robinson, R.R. et al., 
International Patent Publication PCT/US86/02269; Akira, K. et al., European 
Patent Application 184,187; Taniguchi, M., European Patent Application 
171,496; Morrison, S.L. et al., European Patent Application 173,494; 
Neuberger, M.S. et al., PCT Application WO 86/01533; Cabilly, S. et al., 
European Patent Application 125,023; Better, M. et al., Science 
240:1041-1043 (1988); Liu, A.Y. et al., Proc. Natl. Acad. Sci. USA 
84:3439-3443 (1987); Liu, A.Y. et al., J. Immunol. 139:3521-3526 (1987); 
Sun, L.K. et al., Proc. Natl. Acad. Sci. USA 84:214-218 (1987); Nishimura, 
Y. et al., Canc. Res. 47:999-1005 (1987); Wood, C.R. et al., Nature 
314:446-449 (1985); Shaw et al., J. Natl. Cancer Inst. 80:1553-1559 (1988)). 
General reviews of "humanized" chimeric antibodies are provided by Morrison, 
S.L. (Science 229:1202-1207 (1985)) and by Oi, V.T. et al. (BioTechniques 
4:214 (1986)). Suitable "humanized" antibodies can be alternatively produced 
by CDR or CEA substitution (Jones, P.T. et al., Nature 321:522-525 (1986); 
Verhoeyan et al., Science 239:1534 (1988); Beidler, C.B. et al., J. Immunol. 
141:4053-4060 (1988)). 

Suitable enzyme labels include, for example, those from the oxidase 
group, which catalyze the production of hydrogen peroxide by reacting with 
substrate. Glucose oxidase is particularly preferred as it has good stability and 
its substrate (glucose) is readily available. Activity of an oxidase label may be 
assayed by measuring the concentration of hydrogen peroxide formed by the 
enzyme-labeled antibody/substrate reaction. Besides enzymes, other suitable 
labels include radioisotopes, such as iodine (¹²⁵I, ¹²¹I), carbon (¹⁴C), sulphur 
(³⁵S), tritium (³H), indium (¹¹²In), and technetium (⁹⁹ᵐTc), and fluorescent 
labels, such as fluorescein and rhodamine, and biotin. 

Further suitable labels for the Streptococcus polypeptide-specific 
antibodies of the present invention are provided below. Examples of suitable 
enzyme labels include malate dehydrogenase, staphylococcal nuclease, 
delta-5-steroid isomerase, yeast-alcohol dehydrogenase, alpha-glycerol 
phosphate dehydrogenase, triose phosphate isomerase, peroxidase, alkaline 
phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, 
urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase, and 
acetylcholine esterase. 

Examples of suitable radioisotopic labels include ³H, ¹¹¹In, ¹²⁵I, ¹³¹I, 
³²P, ³⁵S, ¹⁴C, ⁵¹Cr, ⁵⁷To, ⁵⁸Co, ⁵⁹Fe, ⁷⁵Se, ¹⁵²Eu, ⁹⁰Y, ⁶⁷Cu, ²¹⁷Ci, ²¹¹At, ²¹²Pb, 
⁴⁷Sc, ¹⁰⁹Pd, etc. ¹¹¹In is a preferred isotope where in vivo imaging is used since 
it avoids the problem of dehalogenation of the ¹²⁵I or ¹³¹I-labeled monoclonal 
antibody by the liver. In addition, this radionuclide has a more favorable 
gamma emission energy for imaging (Perkins et al., Eur. J. Nucl. Med. 
10:296-301 (1985); Carasquillo et al., J. Nucl. Med. 28:281-287 (1987)). For 
example, ¹¹¹In coupled to monoclonal antibodies with 
1-(p-isothiocyanatobenzyl)-DTPA has shown little uptake in non-tumorous 
tissues, particularly the liver, and therefore enhances specificity of tumor 
localization (Esteban et al., J. Nucl. Med. 28:861-870 (1987)). 

Examples of suitable non-radioactive isotopic labels include Gd, 
⁵⁵Mn, ¹⁶²Dy, ⁵²Tr, and ⁵⁶Fe. 

Examples of suitable fluorescent labels include a ¹⁵²Eu label, a 
fluorescein label, an isothiocyanate label, a rhodamine label, a phycoerythrin 
label, a phycocyanin label, an allophycocyanin label, an o-phthaldehyde label, 
and a fluorescamine label. 

Examples of suitable toxin labels include diphtheria toxin, ricin, and 
cholera toxin. 

Examples of chemiluminescent labels include a luminol label, an 
isoluminol label, an aromatic acridinium ester label, an imidazole label, an 
acridinium salt label, an oxalate ester label, a luciferin label, a luciferase label, 
and an aequorin label. 

Examples of nuclear magnetic resonance contrasting agents include 
heavy metal nuclei such as Gd, Mn, and iron. 

Typical techniques for binding the above-described labels to antibodies 
are provided by Kennedy et al., Clin. Chim. Acta 70:1-31 (1976), and Schurs 
et al., Clin. Chim. Acta 81:1-40 (1977). Coupling techniques mentioned in the 
latter are the glutaraldehyde method, the periodate method, the dimaleimide 
method, and the m-maleimidobenzyl-N-hydroxy-succinimide ester method, all of 
which methods are incorporated by reference herein. 

In a related aspect, the invention includes a diagnostic kit for use in 
screening serum containing antibodies specific against S. pneumoniae 
infection. Such a kit may include an isolated S. pneumoniae antigen comprising 
an epitope which is specifically immunoreactive with at least one anti-S. 
pneumoniae antibody. Such a kit also includes means for detecting the binding 
of said antibody to the antigen. In specific embodiments, the kit may include a 
recombinantly produced or chemically synthesized peptide or polypeptide 
antigen. The peptide or polypeptide antigen may be attached to a solid support. 

In a more specific embodiment, the detecting means of the above- 
described kit includes a solid support to which said peptide or polypeptide 
antigen is attached. Such a kit may also include a non-attached reporter-labelled 
anti-human antibody. In this embodiment, binding of the antibody to the S. 
pneumoniae antigen can be detected by binding of the reporter-labelled antibody 
to the anti-S. pneumoniae antibody. 

In a related aspect, the invention includes a method of detecting S. 
pneumoniae infection in a subject. This detection method includes reacting a 
body fluid, preferably serum, from the subject with an isolated S. pneumoniae 
antigen, and examining the antigen for the presence of bound antibody. In a 
specific embodiment, the method includes a polypeptide antigen attached to a 
solid support, and serum is reacted with the support. Subsequently, the support 
is reacted with a reporter-labelled anti-human antibody. The support is then 
examined for the presence of reporter-labelled antibody. 

The solid surface reagent employed in the above assays and kits is
prepared by known techniques for attaching protein material to solid support
material, such as polymeric beads, dip sticks, 96-well plates or filter material.
These attachment methods generally include non-specific adsorption of the
protein to the support or covalent attachment of the protein, typically through a
free amine group, to a chemically reactive group on the solid support, such as
an activated carboxyl, hydroxyl, or aldehyde group. Alternatively, streptavidin
coated plates can be used in conjunction with biotinylated antigen(s).

Therapeutics and Modes of Administration 

The present invention also provides vaccines comprising one or more 
polypeptides of the present invention. Heterogeneity in the composition of a 
vaccine may be provided by combining S. pneumoniae polypeptides of the 
present invention. Multi-component vaccines of this type are desirable because 
they are likely to be more effective in eliciting protective immune responses 
against multiple species and strains of the Streptococcus genus than single 
polypeptide vaccines. Thus, as discussed in detail below, a multi-component 
vaccine of the present invention may contain one or more, preferably 2 to about 
20, more preferably 2 to about 15, and most preferably 3 to about 8, of the S. 
pneumoniae polypeptides identified in Table 1, or fragments thereof. 

Multi-component vaccines are known in the art to elicit antibody
production to numerous immunogenic components. Decker, M. and Edwards,
K., J. Infect. Dis. 174:S270-275 (1996). In addition, a hepatitis B, diphtheria,
tetanus, pertussis tetravalent vaccine has recently been demonstrated to elicit
protective levels of antibodies in human infants against all four pathogenic
agents. Aristegui, J. et al., Vaccine 15:7-9 (1997).

The present invention thus also includes multi-component vaccines. 
These vaccines comprise more than one polypeptide, immunogen or antigen. 
An example of such a multi-component vaccine would be a vaccine comprising 
more than one of the S. pneumoniae polypeptides described in Table 1. A 
second example is a vaccine comprising one or more, for example 2 to 10, of 
the S. pneumoniae polypeptides identified in Table 1 and one or more, for
example 2 to 10, additional polypeptides of either streptococcal or 
non-streptococcal origin. Thus, a multi-component vaccine which confers 
protective immunity to both a Streptococcal infection and infection by another 
pathogenic agent is also within the scope of the invention. 

As indicated above, the vaccines of the present invention are expected to
elicit a protective immune response against infections caused by species and
strains of Streptococcus other than the strain of S. pneumoniae deposited with
the ATCC.

Further within the scope of the invention are whole cell and whole viral
vaccines. Such vaccines may be produced recombinantly and involve the
expression of one or more of the S. pneumoniae polypeptides described in
Table 1. For example, the S. pneumoniae polypeptides of the present invention
may be either secreted or localized intracellularly, on the cell surface, or in the
periplasmic space. Further, when a recombinant virus is used, the S.
pneumoniae polypeptides of the present invention may, for example, be
localized in the viral envelope, on the surface of the capsid, or internally within
the capsid. Whole cell vaccines which employ cells expressing heterologous
proteins are known in the art. See, e.g., Robinson, K. et al., Nature Biotech.
15:653-657 (1997); Sirard, J. et al., Infect. Immun. 65:2029-2033 (1997);
Chabalgoity, J. et al., Infect. Immun. 65:2402-2412 (1997). These cells may
be administered live or may be killed prior to administration. Chabalgoity, J. et
al., supra, for example, report the successful use in mice of a live attenuated
Salmonella vaccine strain which expresses a portion of a platyhelminth fatty
acid-binding protein as a fusion protein on its cell surface.

A multi-component vaccine can also be prepared using techniques 
known in the art by combining one or more S. pneumoniae polypeptides of the 
present invention, or fragments thereof, with additional non-streptococcal 
components (e.g., diphtheria toxin or tetanus toxin, and/or other compounds 
known to elicit an immune response). Such vaccines are useful for eliciting 
protective immune responses to both members of the Streptococcus genus and 
non-streptococcal pathogenic agents. 

The vaccines of the present invention also include DNA vaccines. DNA 
vaccines are currently being developed for a number of infectious diseases. 
Boyer, J. et al., Nat. Med. 3:526-532 (1997); reviewed in Spier, R., Vaccine
14:1285-1288 (1996). Such DNA vaccines contain a nucleotide sequence
encoding one or more S. pneumoniae polypeptides of the present invention
oriented in a manner that allows for expression of the subject polypeptide. The
direct administration of plasmid DNA encoding B. burgdorferi OspA has been
shown to elicit protective immunity in mice against borrelial challenge. Luke,
C. et al., J. Infect. Dis. 175:91-97 (1997).

The present invention also relates to the administration of a vaccine
which is co-administered with a molecule capable of modulating immune
responses. Kim, J. et al., Nature Biotech. 15:641-646 (1997), for example,
report the enhancement of immune responses produced by DNA immunizations
when DNA sequences encoding molecules which stimulate the immune
response are co-administered. In a similar fashion, the vaccines of the present
invention may be co-administered with either nucleic acids encoding immune
modulators or the immune modulators themselves. These immune modulators
include granulocyte macrophage colony stimulating factor (GM-CSF) and
CD86.

The vaccines of the present invention may be used to confer resistance to
streptococcal infection by either passive or active immunization. When the
vaccines of the present invention are used to confer resistance to streptococcal
infection through active immunization, a vaccine of the present invention is
administered to an animal to elicit a protective immune response which either
prevents or attenuates a streptococcal infection. When the vaccines of the
present invention are used to confer resistance to streptococcal infection through
passive immunization, the vaccine is provided to a host animal (e.g., human,
dog, or mouse), and the antisera elicited by this vaccine are recovered and
directly provided to a recipient suspected of having an infection caused by a
member of the Streptococcus genus.

The ability to label antibodies, or fragments of antibodies, with toxin
molecules provides an additional method for treating streptococcal infections
when passive immunization is conducted. In this embodiment, antibodies, or
fragments of antibodies, capable of recognizing the S. pneumoniae polypeptides
disclosed herein, or fragments thereof, as well as other Streptococcus proteins,
are labeled with toxin molecules prior to their administration to the patient.
When such toxin derivatized antibodies bind to Streptococcus cells, toxin
moieties will be localized to these cells and will cause their death.

The present invention thus concerns and provides a means for
preventing or attenuating a streptococcal infection resulting from organisms
which have antigens that are recognized and bound by antisera produced in
response to the polypeptides of the present invention. As used herein, a vaccine
is said to prevent or attenuate a disease if its administration to an animal results
either in the total or partial attenuation (i.e., suppression) of a symptom or
condition of the disease, or in the total or partial immunity of the animal to the
disease.

The administration of the vaccine (or the antisera which it elicits) may be
for either a "prophylactic" or "therapeutic" purpose. When provided
prophylactically, the compound(s) are provided in advance of any symptoms of
streptococcal infection. The prophylactic administration of the compound(s)
serves to prevent or attenuate any subsequent infection. When provided
therapeutically, the compound(s) are provided upon or after the detection of
symptoms which indicate that an animal may be infected with a member of the
Streptococcus genus. The therapeutic administration of the compound(s) serves
to attenuate any actual infection. Thus, the S. pneumoniae polypeptides, and
fragments thereof, of the present invention may be provided either prior to the
onset of infection (so as to prevent or attenuate an anticipated infection) or after
the initiation of an actual infection.

The polypeptides of the invention, whether encoding a portion of a
native protein or a functional derivative thereof, may be administered in pure
form or may be coupled to a macromolecular carrier. Examples of such carriers
are proteins and carbohydrates. Suitable proteins which may act as
macromolecular carriers for enhancing the immunogenicity of the polypeptides of
the present invention include keyhole limpet hemocyanin (KLH), tetanus toxoid,
pertussis toxin, bovine serum albumin, and ovalbumin. Methods for coupling
the polypeptides of the present invention to such macromolecular carriers are
disclosed in Harlow et al., Antibodies: A Laboratory Manual, 2nd Ed.; Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1988), the
entire disclosure of which is incorporated by reference herein.

A composition is said to be "pharmacologically acceptable" if its 
administration can be tolerated by a recipient animal and is otherwise suitable for
administration to that animal. Such an agent is said to be administered in a 
"therapeutically effective amount" if the amount administered is physiologically 
significant. An agent is physiologically significant if its presence results in a 
detectable change in the physiology of a recipient patient. 

While in all instances the vaccine of the present invention is administered 
as a pharmacologically acceptable compound, one skilled in the art would 
recognize that the composition of a pharmacologically acceptable compound 
varies with the animal to which it is administered. For example, a vaccine 
intended for human use will generally not be co-administered with Freund's 
adjuvant. Further, the level of purity of the S. pneumoniae polypeptides of the 
present invention will normally be higher when administered to a human than 
when administered to a non-human animal. 

As would be understood by one of ordinary skill in the art, when the
vaccine of the present invention is provided to an animal, it may be in a
composition which may contain salts, buffers, adjuvants, or other substances
which are desirable for improving the efficacy of the composition. Adjuvants
are substances that can be used to specifically augment a specific immune
response. These substances generally perform two functions: (1) they protect
the antigen(s) from being rapidly catabolized after administration and (2) they
nonspecifically stimulate immune responses.

Normally, the adjuvant and the composition are mixed prior to
presentation to the immune system, or presented separately, but into the same
site of the animal being immunized. Adjuvants can be loosely divided into
several groups based upon their composition. These groups include oil
adjuvants (for example, Freund's complete and incomplete), mineral salts (for
example, AlK(SO4)2, AlNa(SO4)2, AlNH4(SO4), silica, kaolin, and carbon),
polynucleotides (for example, poly IC and poly AU acids), and certain natural
substances (for example, wax D from Mycobacterium tuberculosis, as well as
substances found in Corynebacterium parvum or Bordetella pertussis, and
members of the genus Brucella). Other substances useful as adjuvants are the
saponins such as, for example, Quil A (Superfos A/S, Denmark). Preferred
adjuvants for use in the present invention include aluminum salts, such as
AlK(SO4)2, AlNa(SO4)2, and AlNH4(SO4). Examples of materials suitable for
use in vaccine compositions are provided in Remington's Pharmaceutical
Sciences (Osol, A., Ed., Mack Publishing Co., Easton, PA, pp. 1324-1341
(1980), which reference is incorporated herein by reference).

The therapeutic compositions of the present invention can be 
administered parenterally by injection, rapid infusion, nasopharyngeal
absorption (intranasopharyngeally), dermoabsorption, or orally. The
compositions may alternatively be administered intramuscularly, or 
intravenously. Compositions for parenteral administration include sterile 
aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of 
non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils 
such as olive oil, and injectable organic esters such as ethyl oleate. Carriers or 
occlusive dressings can be used to increase skin permeability and enhance 
antigen absorption. Liquid dosage forms for oral administration may generally 
comprise a liposome solution containing the liquid dosage form. Suitable forms 
for suspending liposomes include emulsions, suspensions, solutions, syrups, 
and elixirs containing inert diluents commonly used in the art, such as purified 
water. Besides the inert diluents, such compositions can also include adjuvants, 
wetting agents, emulsifying and suspending agents, or sweetening, flavoring, 
or perfuming agents. 

Therapeutic compositions of the present invention can also be
administered in encapsulated form. For example, intranasal immunization of
mice against Bordetella pertussis infection using vaccines encapsulated in
biodegradable microspheres composed of poly(DL-lactide-co-glycolide) has been
shown to stimulate protective immune responses. Shahin, R. et al., Infect.
Immun. 63:1195-1200 (1995). Similarly, orally administered encapsulated
Salmonella typhimurium antigens have also been shown to elicit protective
immunity in mice. Allaoui-Attarki, K. et al., Infect. Immun. 65:853-857
(1997). Encapsulated vaccines of the present invention can be administered by
a variety of routes including those involving contacting the vaccine with mucous
membranes (e.g., intranasally, intracolonically, intraduodenally).

Many different techniques exist for the timing of the immunizations 
when a multiple administration regimen is utilized. It is possible to use the 
compositions of the invention more than once to increase the levels and 
diversities of expression of the immunoglobulin repertoire expressed by the 
immunized animal. Typically, if multiple immunizations are given, they will be 
given one to two months apart. 

According to the present invention, an "effective amount" of a 
therapeutic composition is one which is sufficient to achieve a desired biological 
effect. Generally, the dosage needed to provide an effective amount of the 
composition will vary depending upon such factors as the animal's or human's 
age, condition, sex, and extent of disease, if any, and other variables which can 
be adjusted by one of ordinary skill in the art.

The antigenic preparations of the invention can be administered by either 
single or multiple dosages of an effective amount. Effective amounts of the 
compositions of the invention can vary from 0.01-1,000 µg/ml per dose, more
preferably 0.1-500 µg/ml per dose, and most preferably 10-300 µg/ml per dose.

Having now generally described the invention, the same will be more 
readily understood through reference to the following example which is 
provided by way of illustration, and is not intended to be limiting of the present 
invention, unless specified. 

Examples 

Example 1: Expression and Purification of S. pneumoniae 
Polypeptides in E. coli 

The bacterial expression vector pQE10 (QIAGEN, Inc., 9259 Eton
Avenue, Chatsworth, CA, 91311) is used in this example for cloning of the
nucleotide sequences shown in Table 1 and for expressing the polypeptides
identified in Table 1. The components of the pQE10 plasmid are arranged such
that the inserted DNA sequence encoding a polypeptide of the present invention
expresses the polypeptide with the six His residues (i.e., a "6 X His tag")
covalently linked to the amino terminus.

The DNA sequences encoding the desired portions of the polypeptides
of Table 1 are amplified using PCR oligonucleotide primers from either a DNA
library constructed from S. pneumoniae, such as the one deposited by the
inventors at the ATCC for convenience, ATCC Deposit No. 97755, or from
DNA isolated from the same organism such as the S. pneumoniae strain
deposited with the ATCC as Deposit No. 55840. A list of PCR primers which
can be used for this purpose is provided in Table 3, below. The PCR primers
anneal to the nucleotide sequences encoding both the amino terminal and
carboxy terminal amino acid sequences of the desired portion of the
polypeptides of Table 1. Additional nucleotides containing restriction sites to
facilitate cloning in the pQE10 vector were added to the 5' and 3' primer
sequences, respectively. Such restriction sites are listed in Table 3 for each
primer. In each case, the primer comprises, from the 5' end, 4 random
nucleotides to prevent "breathing" during the annealing process, a restriction site
(shown in Table 3), and approximately 15 nucleotides of S. pneumoniae ORF
sequence (the complete sequence of each cloning primer is shown as SEQ ID
NO:227 through SEQ ID NO:452).
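
The primer layout described above (4 random nucleotides, a restriction site, then about 15 nucleotides of ORF sequence) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the enzyme names and recognition sequences in RESTRICTION_SITES are generic, well-known examples and are not taken from Table 3, and the function names build_primers and reverse_complement are hypothetical. The actual primers are those given as SEQ ID NO:227 through SEQ ID NO:452.

```python
import random

# Illustrative recognition sequences only; the sites actually used for each
# primer are the ones listed in Table 3 of the specification.
RESTRICTION_SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT", "SalI": "GTCGAC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_primers(orf, site_5, site_3, anneal_len=15, pad_len=4, rng=None):
    """Build (forward, reverse) cloning primers, each written 5' to 3'.

    Each primer is: pad_len random nucleotides, a restriction site, and
    anneal_len nucleotides that anneal to one end of the ORF.
    """
    rng = rng or random.Random()
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing region")

    def pad():
        # Random leading nucleotides, as described in the text.
        return "".join(rng.choice("ACGT") for _ in range(pad_len))

    forward = pad() + RESTRICTION_SITES[site_5] + orf[:anneal_len]
    # The 3' primer anneals to the opposite strand, so the ORF end is
    # reverse-complemented.
    reverse = pad() + RESTRICTION_SITES[site_3] + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```

Passing a seeded random.Random instance as rng makes the padding nucleotides reproducible, which is convenient when primer lists need to be regenerated.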

For cloning the polypeptides of Table 1, the 5' and 3' primers were
selected to amplify their respective nucleotide coding sequences. One of
ordinary skill in the art would appreciate that the point in the protein coding
sequence where the 5' primer begins may be varied to amplify a DNA segment
encoding any desired portion of the complete amino acid sequences described in
Table 1. Similarly, one of ordinary skill in the art would further appreciate that
the point in the protein coding sequence where the 3' primer begins may also be
varied to amplify a DNA segment encoding any desired portion of the complete
amino acid sequences described in Table 1.

The amplified DNA fragment and the pQE10 vector are digested with the
appropriate restriction enzyme(s) and the digested DNAs are then ligated
together. The ligation mixture is transformed into competent E. coli cells using
standard procedures such as those described in Sambrook et al., Molecular
Cloning: a Laboratory Manual, 2nd Ed.; Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, N.Y. (1989). Transformants are identified by their ability
to grow under selective pressure on LB plates. Plasmid DNA is isolated from
resistant colonies and the identity of the cloned DNA confirmed by restriction
analysis, PCR and DNA sequencing.

Clones containing the desired constructs are grown overnight ("O/N") in
liquid culture under selection. The O/N culture is used to inoculate a large
culture, at a dilution of approximately 1:25 to 1:250. The cells are grown to an
optical density at 600 nm ("OD600") of between 0.4 and 0.6.
Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final
concentration of 1 mM
to induce transcription from the lac repressor sensitive promoter, by inactivating
the lacI repressor. Cells subsequently are incubated further for 3 to 4 hours.
Cells are then harvested by centrifugation.
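
The inoculation and induction figures above reduce to simple dilution arithmetic, sketched below. The function names are hypothetical, and the 1 M (1000 mM) IPTG stock concentration is an assumption chosen for illustration; the text specifies only the dilution range, the OD600 window, and the 1 mM final concentration.

```python
def inoculum_volume_ml(culture_volume_ml, dilution):
    """Volume of overnight culture for a 1:dilution inoculation.

    The text gives a range of approximately 1:25 to 1:250.
    """
    if culture_volume_ml <= 0 or dilution <= 0:
        raise ValueError("volumes and dilutions must be positive")
    return culture_volume_ml / dilution


def iptg_stock_volume_ml(culture_volume_ml, stock_mM=1000.0, final_mM=1.0):
    """Volume of IPTG stock giving final_mM in the culture (C1*V1 = C2*V2).

    The small volume added by the stock itself is ignored. The 1000 mM
    stock is an assumed value, not one stated in the text.
    """
    if stock_mM <= 0:
        raise ValueError("stock concentration must be positive")
    return culture_volume_ml * final_mM / stock_mM


def ready_to_induce(od600):
    """True when OD600 is within the 0.4 to 0.6 window given in the text."""
    return 0.4 <= od600 <= 0.6
```

For a 1 liter culture inoculated at 1:100, this gives 10 ml of overnight culture and, with the assumed 1 M stock, 1 ml of IPTG.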

The cells are stirred for 3-4 hours at 4°C in 6 M guanidine-HCl, pH 8.
The cell debris is removed by centrifugation, and the supernatant containing the
protein of interest is loaded onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA")
affinity resin column (available from QIAGEN, Inc., supra). Proteins with a
6 X His tag bind to the Ni-NTA resin with high affinity and can be purified in a
simple one-step procedure (for details see: The QIAexpressionist, 1995,
QIAGEN, Inc., supra). Briefly, the supernatant is loaded onto the column in 6
M guanidine-HCl, pH 8, the column is first washed with 10 volumes of 6 M
guanidine-HCl, pH 8, then washed with 10 volumes of 6 M guanidine-HCl,
pH 6, and finally the polypeptide is eluted with 6 M guanidine-HCl, pH 5.0.
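
The load, wash, and elution sequence above can be written down as a small data structure, which makes the stepwise pH drop explicit and lets the wash buffer requirement be computed. This is a sketch only: the names Step, PROTOCOL, and wash_buffer_ml are hypothetical, and "10 volumes" is interpreted here as 10 column (bed) volumes, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    name: str
    ph: float
    column_volumes: Optional[float]  # None where the text gives no volume


# All steps use 6 M guanidine-HCl; only the pH changes between steps.
PROTOCOL = (
    Step("load supernatant", 8.0, None),
    Step("wash 1", 8.0, 10.0),
    Step("wash 2", 6.0, 10.0),
    Step("elute", 5.0, None),
)


def wash_buffer_ml(bed_volume_ml):
    """Total wash buffer needed, assuming 'volumes' means column bed volumes."""
    if bed_volume_ml <= 0:
        raise ValueError("bed volume must be positive")
    return sum(
        step.column_volumes * bed_volume_ml
        for step in PROTOCOL
        if step.column_volumes is not None
    )
```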

The purified protein is then renatured by dialyzing it against
phosphate-buffered saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl.
Alternatively, the protein can be successfully refolded while immobilized on the
Ni-NTA column. The recommended conditions are as follows: renature using
a linear 6 M-1 M urea gradient in 500 mM NaCl, 20% glycerol, 20 mM Tris/HCl
pH 7.4, containing protease inhibitors. The renaturation should be performed
over a period of 1.5 hours or more. After renaturation the proteins can be eluted
by the addition of 250 mM imidazole. Imidazole is removed by a final dialyzing
step against PBS or 50 mM sodium acetate, pH 6 buffer plus 200 mM NaCl.
The purified protein is stored at 4°C or frozen at -80°C.
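
The linear urea gradient used for on-column refolding can be expressed as a simple interpolation. The sketch below assumes the minimum duration of 1.5 hours (90 minutes) as the default; the function name urea_molarity and its parameters are hypothetical and introduced for illustration only.

```python
def urea_molarity(elapsed_min, total_min=90.0, start_m=6.0, end_m=1.0):
    """Urea concentration (M) at a given time along a linear gradient.

    Times before the start or after the end are clamped to the gradient
    endpoints.
    """
    if total_min <= 0:
        raise ValueError("gradient duration must be positive")
    fraction = min(max(elapsed_min / total_min, 0.0), 1.0)
    return start_m + (end_m - start_m) * fraction
```

With the defaults, the midpoint of the gradient (45 minutes) corresponds to 3.5 M urea.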

The DNA sequences encoding the amino acid sequences of Table 1 may
also be cloned and expressed as fusion proteins by a protocol similar to that
described directly above, wherein the pET-32b(+) vector (Novagen, 601
Science Drive, Madison, WI 53711) is preferentially used in place of pQE10.

Each of the polynucleotides shown in Table 1 was successfully
amplified and subcloned into pQE10 as described above using the PCR primers
shown in Table 3. These pQE10 plasmids containing the DNAs of Table 1,
except SP023, SP042, SP054, SP063, SP081, SP092, SP114, SP122,
SP123, SP126, and SP127, were deposited with the ATCC as a pooled deposit
as a convenience to those of skill in the art. This pooled deposit was deposited
on October 16, 1997 and given ATCC Deposit No. 209369. Those of ordinary
skill in the art appreciate that isolating an individual plasmid from the pooled
deposit is trivial provided the information and reagents described herein. Each
of the deposited clones is capable of expressing its encoded S. pneumoniae
polypeptide.

Example 2: Immunization and Detection of Immune Responses
Methods

Growth of bacterial inoculum, immunization of Mice and
Challenge with S. pneumoniae.

Propagation and storage of, and challenge by S. pneumoniae are
performed essentially as described in Aaberge, I.S. et al., Virulence of
Streptococcus pneumoniae in mice: a standardized method for preparation and
frozen storage of the experimental bacterial inoculum, Microbial Pathogenesis,
18:141 (1995), incorporated herein by reference.

Briefly, Todd Hewitt (TH) broth (Difco laboratories, Detroit, MI) with
17% FCS, and horse blood agar plates are used for culturing the bacteria. Both
broth and blood plates are incubated at 37°C in a 5% CO2 atmosphere. Blood
plates are incubated for 18 hr. The culture broth is regularly 10-fold serially
diluted in TH broth kept at room temperature and bacterial suspensions are kept
at room temperature until challenge of mice.

For active immunizations C3H/HeJ mice (The Jackson Laboratory, Bar
Harbor, ME) are injected intraperitoneally (i.p.) at week 0 with 20 µg of
recombinant streptococcal protein, or phosphate-buffered saline (PBS),
emulsified with complete Freund's adjuvant (CFA), given a similar booster
immunization in incomplete Freund's adjuvant (IFA) at week 4, and challenged
at week 6. For challenge S. pneumoniae are diluted in TH broth from
exponentially-growing cultures and mice are injected subcutaneously (s.c.) at
the base of the tail with 0.1 ml of these dilutions (serial dilutions are used to find
the median infectious dose). Streptococci used for challenge are passaged fewer
than six times in vitro. To assess infection, blood samples are obtained from
the distal part of the lateral femoral vein into heparinized capillary tubes. A 25
µl blood sample is serially 10-fold diluted in TH broth, and 25 µl of diluted and
undiluted blood is plated onto blood agar plates. The plates are incubated for 18
hr. and colonies are counted.
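
Colony counts from the plated dilutions described above can be converted back to a bacterial concentration in the original blood sample. The sketch below shows the standard back-calculation; the function name cfu_per_ml is hypothetical, and the calculation itself is a common convention rather than a step spelled out in the text.

```python
def cfu_per_ml(colonies, tenfold_steps, plated_volume_ul=25.0):
    """Colony-forming units per ml of undiluted blood.

    colonies: colonies counted on one plate.
    tenfold_steps: number of serial 10-fold dilutions before plating
        (0 for undiluted blood).
    plated_volume_ul: volume plated, 25 µl in the text.
    """
    if colonies < 0 or tenfold_steps < 0 or plated_volume_ul <= 0:
        raise ValueError("invalid count, dilution, or volume")
    dilution_factor = 10 ** tenfold_steps
    # 1000 µl per ml converts the plated volume to a per-ml figure.
    return colonies * dilution_factor * 1000.0 / plated_volume_ul
```

For example, 50 colonies on a plate from the second 10-fold dilution correspond to 200,000 CFU per ml of blood.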

Other methods are known in the art, for example, see Langermann, S. et
al., J. Exp. Med., 180:2277 (1994), incorporated herein by reference.

Immunoassays

Several immunoassay formats are used to quantify levels of 
streptococcal-specific antibodies (ELISA and immunoblot), and to evaluate the 
functional properties of these antibodies (growth inhibition assay). The ELISA 
and immunoblot assays are also used to detect and quantify antibodies elicited in 
response to streptococcal infection that react with specific streptococcal 
antigens. Where antibodies to certain streptococcal antigens are elicited by 
infection this is taken as evidence that the streptococcal proteins in question are 
expressed in vivo. Absence of infection-derived antibodies (seroconversion) 
following streptococcal challenge is evidence that infection is prevented or 
suppressed. The immunoblot assay is also used to ascertain whether antibodies 
raised against recombinant streptococcal antigens recognize a protein of similar 
size in extracts of whole streptococci. Where the natural protein is of similar, or 
identical, size in the immunoblot assay to the recombinant version of the same 
protein, this is taken as evidence that the recombinant protein is the product of a 
full-length clone of the respective gene.

Enzyme-Linked Immunosorbent Assay (ELISA).
The ELISA is used to quantify levels of antibodies reactive with streptococcal
antigens elicited in response to immunization with these streptococcal antigens.
Wells of 96 well microtiter plates (Immunlon 4, Dynatech, Chantilly, Virginia,
or equivalent) are coated with antigen by incubating 50 µl of 1 µg/ml protein
antigen solution in a suitable buffer, typically 0.1 M sodium carbonate buffer at
pH 9.6. After decanting unbound antigen, additional binding sites are blocked
by incubating 100 µl of 3% nonfat milk in wash buffer (PBS, 0.2% Tween 20,
pH 7.4). After washing, duplicate serial two-fold dilutions of sera in PBS,
Tween 20, 1% fetal bovine serum, are incubated for 1 hr, removed, wells are
washed three times, and incubated with horseradish peroxidase-conjugated goat
anti-mouse IgG. After three washes, bound antibodies are detected with H2O2
and 2,2'-azino-di-(3-ethylbenzthiazoline sulfonate) (Schwan, T.G., et al., Proc.
Natl. Acad. Sci. USA 92:2909-2913 (1985)) (ABTS®, Kirkegaard & Perry
Labs., Gaithersburg, MD) and A405 is quantified with a Molecular Devices,
Corp. (Menlo Park, California) Vmax™ plate reader. IgG levels twice the
background level in serum from naive mice are assigned the minimum titer of
1:100.
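
One common way to turn the serial two-fold dilution readings into a titer is an endpoint calculation, sketched below. This is an interpretation, not the procedure stated in the text: it assumes the dilution series starts at 1:100, that the cutoff is twice the A405 background of naive serum, and that the titer is the reciprocal of the highest dilution still at or above the cutoff. The function name endpoint_titer and its parameters are hypothetical.

```python
def endpoint_titer(absorbances, background, start_dilution=100, fold=2,
                   cutoff_ratio=2.0):
    """Reciprocal endpoint titer from a serial dilution series.

    absorbances: A405 readings ordered from least to most dilute serum.
    background: A405 of naive-mouse serum.
    Returns the reciprocal of the highest dilution whose reading is at or
    above cutoff_ratio * background, or None if even the first dilution
    falls below the cutoff.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    cutoff = cutoff_ratio * background
    titer = None
    dilution = start_dilution
    for reading in absorbances:
        if reading < cutoff:
            break  # stop at the first dilution that drops below the cutoff
        titer = dilution
        dilution *= fold
    return titer
```

With a background of 0.1 and readings of 1.2, 0.8, 0.45, 0.15, and 0.08, the first three dilutions (1:100, 1:200, 1:400) exceed the 0.2 cutoff, so the sketch reports a titer of 400.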




Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis
(SDS-PAGE) and Immunoblotting

Using a single well format, total streptococcal protein extracts or
recombinant streptococcal antigen are boiled in SDS/2-ME sample buffer before
electrophoresis through 3% acrylamide stacking gels, and resolving gels of
higher acrylamide concentration, typically 10-15% acrylamide monomer. Gels
are electro-blotted to nitrocellulose membranes and lanes are probed with
dilutions of antibody to be tested for reactivity with specific streptococcal
antigens, followed by the appropriate secondary antibody-enzyme (horseradish
peroxidase) conjugate. When it is desirable to confirm that the protein has
transferred following electro-blotting, membranes are stained with Ponceau S.
Immunoblot signals from bound antibodies are detected on x-ray film as
chemiluminescence using ECL™ reagents (Amersham Corp., Arlington
Heights, Illinois).

Example 3: Detection of Streptococcus mRNA expression 

Northern blot analysis is carried out using methods described by, among
others, Sambrook et al., supra, to detect the expression of the S. pneumoniae
nucleotide sequences of the present invention in animal tissues. A cDNA probe
containing an entire nucleotide sequence shown in Table 1 is labeled with 32P
using the rediprime™ DNA labeling system (Amersham Life Science),
according to manufacturer's instructions. After labeling, the probe is purified
using a CHROMA SPIN-100™ column (Clontech Laboratories, Inc.),
according to manufacturer's protocol number PT 1200-1. The purified labeled
probe is then used to detect the expression of Streptococcus mRNA in an animal
tissue sample.

Animal tissues, such as blood or spinal fluid, are examined with the
labeled probe using ExpressHyb™ hybridization solution (Clontech) according
to manufacturer's protocol number PT 1190-1. Following hybridization and
washing, the blots are mounted and exposed to film at -70°C overnight, and
films developed according to standard procedures.

It will be clear that the invention may be practiced otherwise than as
particularly described in the foregoing description and examples.

Numerous modifications and variations of the present invention are
possible in light of the above teachings and, therefore, are within the scope of
the appended claims.

The entire disclosure of all publications (including patents, patent
applications, journal articles, laboratory manuals, books, or other documents)
cited herein is hereby incorporated by reference.

SP001 nucleotide (SEQ ID NO:1)

TAAAATCTACGACAATAAAAATCAACTCATTGCTGACTTGGGTTCTGAACGCCGCGTCAATGCCCAAGC
TAATGATATTCCCACAGATTTGGTTAAGGCAATCGTTTCTATCGAAGACCATCGCTTCTTCGACCACAG
GGGGATTGATACCATCCGTATCCTGGGAGCTTTCTTGCGCAATCTGCAAAGCAATTCCCTCCAAGGTGG 
ATCAACTCTCACCCAACAGTTGATTAAGTTGACTTACTTTTCAACTTCGACTTCCGACCAGACTATTTC 
TCGTAAGGCTCAGGAAGCTTGGTTAGCGATTCAGTTAGAACAAAAAGCAACCAAGCAAGAAATCTTGAC 
CTACTATATAAATAAGGTCTACATGTCTAATGGGAACTATGGAATGCAGACAGCAGCTCAAAACTACTA 
TGGTAAAGACCTCAATAATTTAAGTTTACCTCAGTTAGCCTTGCTGGCTGGAATGCCTCAGGCACCAAA 
CCAATATGACCCCTATTCACATCCAGAAGCAGCCCAAGACCGCCGAAACTTGGTCTTATCTGAAATGAA 
AAATCAAGGCTACATCTCTGCTGAACAGTATGAGAAAGCAGTCAATACACCAATTACTGATGGACTACA 
AAGTCTCAAATCAGCAAGTAATTACCCTGCTTACATGGATAATTACCTCAAGGAAGTCATCAATCAAGT 
TGAAGAAGAAACAGGCTATAACCTACTCACAACTGGGATGGATGTCTACACAAATGTAGACCAAGAAGC 
TCAAAAACATCTGTGGGATATTTACAATACAGACGAATACGTTGCCTATCCAGACGATGAATTGCAAGT 
CGCTTCTACCATTGTTGATGTTTCTAACGGTAAAGTCATTGCCCAGCTAGGAGCACGCCATCAGTCAAG
TAATGTTTCCTTCGGAATTAACCAAGCAGTAGAAACAAACCGCGACTGGGGATCAACTATGAAACCGAT 
CACAGACTATGCTCCTGCCTTGGAGTACGGTGTCTACGATTCAACTGCTACTATCGTTCACGATGAGCC 
CTATAACTACCCTGGGACAAATACTCCTGTTTATAACTGGGATAGGGGCTACTTTGGCAACATCACCTT 
GCAATACGCCCTGCAACAATCGCGAAACGTCCCAGCCGTGGAAACTCTAAACAAGGTCGGACTCAACCG
CGCCAAGACTTTCCTAAATGGTCTAGGAATCGACTACCC.\AGTATTCACTACTCAAATGCCATTTCAAG 
TAACACAACCGAATCAGACAAAAAATATGGAGCAAGTAGTGAAAAGATGGCTGCTGCTTACGCTGCCTT 
TGCAAATGGTGGAACTTACTATAAACCAATGTATATCCATAAAGTCGTCTTTAGTGATGGGAGTGAAAA 
AGAGTTCTCTAATGTCGGAACTCGTGCCATGAAGGAAACGACAGCCTATATGATGACCGACATGATGAA 
AACAGTCTTGACTTATGGAACTGGACGAAATGCCTATCTTGCTTGGCTCCCTCAGGCTGGTAAAACAGG 
AACCTCTAACTATACAGACGAGGAAATTGAAAACCACATCAAGACCTCTCAATTTGTAGCACCTGATGA 
ACTATTTGCTGGCTATACGCGTAAATATTCAATGGCTGTATGGACAGGCTATTCTAACCGTCTGACACC 
ACTTGTAGGCAATGGCCTTACGGTCGCTGCCAAAGTTTACCGCTCTATGATGACCTACCTGTCTGAAGG 
AAGCAATCCAGAAGATTGGAATATACCAGAGGGGCTCTACAGAAATGGAGAATTCGTATTTAAAAATGG 
TGCTCGTTCTACGTGGAACTCACCTGCTCCACAACAACCCCCATCAACTGAAAGTTCAAGCTCATCATC 
AGATAGTTCAACTTCACAGTCTAGCTCAACCACTCCAAGCACAAATAATAGTACGACTACCAATCCTAA 
CAATAATACGCAACAATCAAATACAACCCCTGATCAACAAAATCAGAATCCTCAACCAGCACAACCA 

SP001 amino acid (SEQ ID NO:2) 

KIYDNKNQLIADLGSERRVNAQANDIPTDLVKAIVSIEDHRFFDKRGIDTIRILGAFLRNLQSNSLQGG 
STLTQQLIKLTYFSTSTSDQTISRKAQEJWIJ^IQ^ 

GKDLNNLSLPQLALLAGMPQAPNQYDPYSHPEAAQDRRNLVLSEMKNQGYISAEQYEKAVNTPITDGLQ 
SLKSASNYPAYMD^LKTVINQVEEETGYNLLTTGMDVYTNVDQEAQKHLWDIYNTDEYVAYP 
ASTIVUVSNGKVIAQLGARHQSSNVSFGINQAVETNRDWGSTMKPITDYAPALEYGVYDSTATIVHDEP 
YNYPGTNTPVYNWDRGYFGNITLQYALQQSRNVPAVETLNKVGLNRAKTFLNGLGIDYPSIHYSNAISS 
NTTESDKKYGAS S EKMAAAYAAFANGGTYYKPMY I HKWF SDG S EKE? SNVGTRAMKETTAYMMTDMMK 
TVLTYGTGRNAYLAWLPQAGKTGTSNYTDEEIENHIKTSQFVAPDELr AGYTRKYSMAVWTGYSNRLT? 
LVGNGLTVAAKVYRSMMTYLSEGSNPEDWNIPEGLYRNGEFVFKNGARSTWNSPAPQQPPSTESSSSSS 
DSSTSQS S STT PSTNNSTTTNPNNNTQQSNTT PDQQNQNPQ PAQ P 
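
The relationship between each nucleotide listing and its paired amino acid listing can be checked by translating the nucleotide sequence with the standard genetic code. The following is a minimal sketch, not part of the original disclosure. The function names `translate` and `find_frame` are hypothetical, whitespace introduced during text extraction is ignored, and any codon containing an unreadable character is reported as `X`. Judging only from the opening residues, the SP001 nucleotide listing appears to begin one base before the first codon of the SP001 amino acid listing.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides, frame=0):
    """Translate a DNA string starting at offset `frame` (0, 1 or 2).

    Whitespace is stripped first; codons that are not in the table
    (for example, those containing a garbled character) become 'X'.
    """
    seq = "".join(nucleotides.split()).upper()
    residues = []
    for pos in range(frame, len(seq) - 2, 3):
        residues.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(residues)


def find_frame(nucleotides, protein_prefix):
    """Return the first offset whose translation starts with
    `protein_prefix`, or None if no offset matches."""
    for frame in range(3):
        if translate(nucleotides, frame).startswith(protein_prefix):
            return frame
    return None


if __name__ == "__main__":
    # Opening of the SP001 nucleotide listing and of its amino acid listing.
    start_nt = "TAAAATCTACG AC AAT AAAAATCAACTC"
    start_aa = "KIYDNKNQL"
    print(find_frame(start_nt, start_aa))
```

A check of this kind only confirms that two listings agree with each other; where they disagree, it does not indicate which of the two was damaged during extraction.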

SP004 nucleotide (SEQ ID NO:3) 

AAATTACAATACGGACTATGAATTGACCTCTGGAGAAAAATTACCTCTTCCTAAAGAGATTTCAGGTTA 
CACTTATATTGGATATATCAAAGAGGGAAAAACGACTTCTGAGTCTGAAGTAAGTAATCAAAAGAGTTC 
AGTTGCCACTCCTACAAAACAACAAAAGGTGGATTATAATGTTACACCGAATTTTGTAGACCATCCATC 
AACAGTACAAGCTATTCAGGAACAAACACCTGTTTCTTCAACTAAGCCGACAGAAGTTCAAGTAGTTGA 
AAAACCTTTCTCTACTGAATTAATCAATCC.2lAGAAAAGAAGAGAAACAATCTTCAGATTCTCAAGAACA 
ATT AGC CG AAC AT AAG AATCT AG AAACG AAG AAAG AGG AG AAG ATTTC T C C AAAAGAAAAG ACTGGGGT 
AAATACATTAAATCCACAGGATGAAGTTTTATCAGGTCAATTGAACAAACCTGAACTCTTATATCGTGA 
GG AAAC T ATGG AG AC AAAAAT AG ATTTTC AAG AAG AAATTC AAG AAAAT C C TG ATTT AGCTGAAGG AAC 
TCTAAGAGTAAAACAAGAAGGTAAATTAGGTAAGAAAGTTGAAATCGTCAGAATATTCTCTGTAAACAA 
GGAAGAAGTTTCGCGAGAAATTGTTTCAACTTCAACGACTGCGCCTAGTCCAAGAATAGTCGAAAAAGG 
TACTAAAAAAACTCAAGTTATAAAGGAACAACCTGAGACTGGTGTAGAACATAAGGACGTACAGTCTGG 
AGCTATTGTTGAACCCGCAATTCAGCCTGAGTTGCCCGAAGCTGTAGTAAGTGACAAAGGCGAACCAGA 
AGTTCAACCTACATTACCCGAAGCAGTTGTGACCGACAAAGGTGAGACTGAGGTTCAACCAGAGTCGCC 
AGATACTGTGGTAAGTGATAAAGGTGAACCAGAGCAGGTAGCACCGCTTCCAGAATATAAGGGTAATAT 
TGAGCAAGTAAAACCTGAAACTCCGGTTGAGAAGACCAAAGAACAAGGTCCAGAAAAAACTGAAGAAGT 
TCCAGTAAAACCAACAGAAGAAACACCAGTAAATCCAAATGAAGGTACTACAGAAGGAACCTCAATTCA 
AGAAGCAGAAAATCCAGTTCAACCTGCAGAAGAATCAACAACGAATTCAGAGAAAGTATCACCAGATAC 
ATCTAGCAAAAATACTGGGGAAGTGTCCAGTAATCCTAGTGATTCGACAACCTCAGTTGGAGAATCAAA 
TAAACCAGAACATAATGACTCTAAAAATGAAAATTCAGAAAAAACTGTAGAAGAAGTTCCAGTAAATCC 
AAATGAAGGCACAGTAGAAGGTACCTCAAATCAAGAAACAGAAAAACCAGTTCAACCTGCAGAAGAAAC 
ACAAACAAACTCTGGGAAAATAGCTAACGAAAATACTGGAGAAGTATCCAATAAACCTAGTGATTCAAA 
ACCACCAGTTGAAGAATCAAATCAACCAGAAAAAAACGGAACTGCAACAAAACCAGAAAATTCAGGTAA 
TACAACATCAGAGAATGGACAAACAGAACCAGAACCATCAAACGGAAATTCAACTGAGGATGTTTCAAC 
CGAATCAAACACATCCAATTCAAATGGAAACGAAGAAATTAAACAAGAAAATGAACTAGACCCTGATAA 
AAAGGTAGAAGAACCAGAGAAAACACTTGAATTAAGAAATGTTTCCGACCTAGAGTTA 

SP004 amino acid (SEQ ID NO: 4) 

NYNTDYELTSGEKLPLPKEISGYTYIGYIKEGKTTSESEVSNQKSSVATPTKQQKVDYNVTPNFVDKPS 
TVQAIQEQTPVSSTKPTEVQWEKPFSTELINPRKEEKQSSDSQEQLAEHKNLETKKEEKISPKEKTGV 
NT^PQDEVLSGQUTCPELLYREETMETKIDFQEEIQENPDLAEG 

EEVSREIVSTSTTAPSPRIVEKGTKKTQVIKEQPETGVEHKDVQSGAIVEPAIQPELPEAWSDKGEPE 
VQPTLPEAWTDKGETEVQPESPD7WSDKGEPEQVAPLPEYKGNIEQVKPETPVEKTKE0GPEKTEEV 
PVKPTEETPVNPNEGTTEGTSIQEAENPVQPAEESTTNSEKVSPDTSSKNTGEVSSNPSDSTTSVGESN 
KPEHNDSKNENSEKTVEEVPWPNEGTVEGTSNQETEKPVQPAEETQTNSGKIANENTGEVSNKPSOSK 
PPVEESNQPEKNGTATKPENSGMTTSENGQTEPEPSNGNSTEDVSTESNTSNSNGNEEIKQENELDPDK 

KVEEPEKTLELRNVSDLEL 

SP006 nucleotide (SEQ ID NO:5) 

TGAGAATCAAGCTACACCCAAAGAGACTAGCGCTCAAAAGACAATCGTCCTTGCTACAGCTGGCGACGT 
GCCACCATTTGACTACGAAGACAAGGGCAATCTGACAGGCTTTGATATCGAAGTTTTAAAGGCAGTAGA 
TGAAAAACTCAGCGACTACGAGATTCAATTCCAAAGAACCGCCTGGGAGAGCATCTTCCCAGGACTTGA 
TTCTGGTC ACT ATC AGG C TGCGGCC AAT AACTTGAGTT AC AC AAAAG AGCG TGCTG AAAAAT ACCTTT A 
CTCGCTTCCAATTTCCAACAATCCCCTCGTCCTTGTCAGCAACAAGAAAAATCCTTTGACTTCTCTTGA 
CC AGAT CGC TGG T AAAAC AAC AC AAG AGG AT AC CGG AACTTCT AACG CTC AATTC ATC AAT AAC TGG AA 
TCAGAAACACACTGATAATCCCGCTACAATTAATTTTTCTGGTGAGGATATTGGTAAACGAATCCTAGA 
CCTTGCT AAC GG AG AGTTTGATTTCCTAGTTTTTG AC AAGGTATCCGTTC AAAAG ATT ATC AAGG AC CG 
TGGTTT AG AC CTCTC AGTCGTTG ATTT AC C TTC TGC AGAT AGC CC C AGC AATT AT ATC ATTTTC TC AAG 
CGACCAAAAAGAGTTTAAAGAGCAATTTGATAAAGCGCTCAAAGAACTCTATCAAGACGGAACCCTTGA 
AAAACTCAGCAATACCTATCTAGGTGGTTCTTACCTCCCAGATCAATCTCAGTTACAA 

SP006 amino acid (SEQ ID NO:6) 

ENQATPKETSAQKTIVLATAGDVPPFDYEDKGNLTGFDIEVLKAVDEKLSDYEIQFQRTAWESIFPGLD 
SGHYQAAANNLSYTKERAEKYLYSLPISNNPLVLVSbnCKNPLTSLI^IAGKTTQEDTGTSNAQFINN^ 
QKHTDNPATINFSGEDIGKRILDLANGEFDFLVFDKVSVQKIIKDRGLDLSVVDLPSADSPSNYIIFSS 
DQKEFKEQFDKALKELYQDGTLEKLSNTYLGGSYLPDQSQLQ 

SP007 nucleotide (SEQ ID NO:7) 

TGGTAACCGCTCTTCTCGTAACGCAGCTTCATCTTCTGATGTGAAGACAAAAGCAGCAATCGTCACTGA 
TACTGGTGGTGTTGATGACAAATCATTCAACCAATCAGCTTGGGAAGGTTTGCAGGCTTGGGGTAAAGA 
ACACAATCTTTCAAAAGATAACGGTTTCACTTACTTCCAATCAACAAGTGAAGCTGACTACGCTAACAA 
CTTGCAACAAGCGGCTGGAAGTTACAACCTAATCTTCGGTGTTGGTTTTGCCCTTAATAATGCAGTTAA 
AGATGCAGCAAAAGAACACACTGACTTGAACTATGTCTTGATTGATGATGTGATTAAAGACCAAAAGAA 
TGTTGCGAGCGTAACTTTCGCTGATAATGAGTCAGGTTACCTTGCAGGTGTGGCTGCAGCAAAAACAAC 
T AAG AC AAAAC AAGTTGGTTTTGTAGGTGGTATCGAATCTGAAGTTATCTCTCGTTTTGAAGC AGG ATT 
CAAGGCTGGTGTTGCGTCAGTAGACCCATCTATCAAAGTCCAAGTTGACTACGCTGGTTCATTTGC-TGA 
TGCGGCTAAAGGTAAAACAATTGCAGCCGCACAATACGCAGCCGGTGCAGATATTGTTTACCAAGTAGC 
TGGTGGTACAGGTGCAGGTGTCTTTGCAGAGGCAAAATCTCTCAACGAAAGCCGTCCTGAAAATGAAAA 
AGTTTGGGTTATCGGTGTTGATCGTGACCAAGAAGCAGAAGGTAAATACACTTCTAAAGATGGCAAAGA 
ATC AAAC TTTG T TC TTGT ATC T A CTTTG AAAC AAGTTGGT AC AACTGT AAAAG AT ATTTCT AAC AAGGC 
AGAAAGAGGAGAATTCCCTGGCGGTCAAGTGATCGTTTACTCATTGAAGGATAAAGGGGTTGACTTGGC 
AG T AAC AAAC CTTTCAGAAGAAGGTAAAAAAG C TGTCGAAGATGC AAAAGC T AAAATC CTTG ATGG AAG 
CGTAAAAGTTCCTGAAAAA 
SP007 amino acid (SEQ ID NO: 8) 

GNRSSRNAASSSDVKTKAAIVTDTGGVDDKSFNQSAWEGLQAWGKEHNLSKDNGFTYFQSTSEADYANN 
LQQAAGSYNLIFGVGFALNNAVKDAAKEHTDLNYVLI DDVI KDQKNVASVTFADNESGYLAGVAAAKTT 
5CTKQVGFVGGI E S EVI S RF EAGFKAGVASVDP S I KVQVDYAGS FGDAAKGKT I AAAQ YAAGADI VYQVA 
GGTGAGVFAEAKSLNESRPENEKVWIGVDRDQEAEGKYTSKDGKESNFVLVST^ 
ERGEFPGGQVIWSLKDKGVDLAVTNLSEEGKKAVEDAKAKILDGSVKVPEK 

SP008 nucleotide (SEQ ID NO: 9) 

TGTGGAAATTTGACAGGTAACAGCAAAAAAGCTGCTGATTCAGGTGACAAACCTGTTATCAAAATGTAC 
CAAATCGGTGACAAACCAGACAACTTGGATGAATTGTTAGCAAATGCCAACAAAATCATTGAAGAAAAA 
GTTGGTGCCAAATTGGATATCCAATACCTTGGCTGGGGTGACTATGGTAAGAAAATGTCAGTTATCACA 
TCATCTGGTGAAAACTATGATATTGCCTTTGCAGATAACTATATTGTAAATGCTCAAAAAGGTGCTTAC 
GCTGACTTGACAGAATTGTACAAAAAAGAAGGTAAAGACCTTTACAAAGCACTTGACCCAGCTTACATC 
AAGGGTAATACTGTAAATGGTAAGATTTACGCTGTTCCAGTTGCAGCCAACGTTGCATCATCTCAAAAC 
TTTGCCTTCAACGGAACTCTCCTTGCTAAATATGGTATCGATATTTCAGGTGTTACTTCTTACGAAACT 
CTTG AGC C AGTCTTGAAAC AAATC AAAG AAAAAG CTC C AGACGT AGT AC C ATTTGCT ATTGGT AAAGTT 
TTCATCCCATCTGATAATTTTGACTACCCAGTAGCAAACGGTCTTCCATTCGTTATCGACCTTGAAGGC 
GATACTACTAAAGTTGTAAACCGTTACGAAGTGCCTCGTTTCAAAGAACACTTGAAGACTCTTCACAAA 
TTCTATGAAGCTGGCTACATTCCAAAAGACGTCGCAACAAGCGATACTTCCTTTGACCTTCAACAAGAT 
ACTTGGTTCGTTCGTGAAGAAACAGTAGGACCAGCTGACTACGGTAACAGCTTGCTTTCACGTGTTGCC 
AACAAAGATATCCAAATCAAACCAATTACTAACTTCATCAAGNAAAACCAAACAACACAAGTTGCTAAC 
TTTGTCATCTCAAACAACTCTAAGAACAAAGAAAAATCAATGGAAATCTTGAACCTCTTGAATACGAAC 
C C AG AAC TC TTG AAC GG TCTTG TTT ACGG TC C AG AAGGC AAG AACTGGG AAAAAATTG AAGG T AAAG AA 
AACCGTGTTCGCGTTCTTGATGGCTACAAAGGAAACACTCACATGGGTGGATGGAACACTGGTAACAAC 
TGGATCCTTTACATCAACGAAAACGTTACAGACCAACAAATCGAAAATTCTAAGAAAGAATTGGCAGAA 
GCT AAAG AATCTCCAGCGCTTGGATTTATCTTCAATACTG AC AATGTG AAATC TGAAATCTCAGCT ATT 
G C T AAC AC AATGC AAC AATTTG AT AC AGCT AT C AAC ACTGGT ACTGT AGAC C C AG AT AAAGCG ATTCC A 
GAATTGATGGAAAAATTGAAATCTGAAGGTGCCTACGAAAAAGTATTGAACGAAATGCAAAAACAATAC 
G ATG AATTC TTG AAAAAC AAAAAA 

SP008 amino acid (SEQ ID NO: 10) 

CGNLTGNSKKAADSGDKPVIKMYQIGDKPDNLDELLANANKIIEEKVGAKLDIQYLGWGDYGKXMSVIT 
SSGEbTTOIAFADtTC IVNAQKGAYADLTELYKKEGKDLYKALD^ 

rAFNGTLLAKYGIDISGVTSYETLEPVLKQIKEKAPDWPFAIGKVFIPSDNFDYPVANGLPFVIDLEG 
DTTKVVmYEVPRFKEHLKTLHKFYEAGYIPKDVATSDTSFDLQQDTWFVP.EETVGPADYGNSLLSRVA 
NXDIQIKPITNFIXXNQTTQVANFVISNNSKNKEKSM 

NRVRVLDG YKGNTHMGGWNTGNNWI L Y INENVTDQQ I ENS KKE LAE AKES PALGFI FNTDNVKS E I S A I 
ANTMQQFDTAINTGTVDPDKAIPELMEKLKSEGAYEKVLNEMQKQYDEFLKNKK 

SP009 nucleotide (SEQ ID NO: 11) 

TGGTCAAGGAACTGCTTCTAAAGACAACAAAGAGGCAGAACTTAAGAAGGTTGACTTTATCCTAGACTG 
GACACCAAATACCAACCACACAGGGCTTTATGTTGCCAAGGAAAAAGGTTATTTCAAAGAAGCTGGAGT 
GGATGTTGATTTGAAATTGCCACCAGAAGAAAGTTCTTCTGACTTGGTTATCAACGGAAAGGCACCATT 
TGCAGTGTATTTCCAAGACTACATGGCTAAGAAATTGGAAAAAGGAGCAGGAATCACTGCCGTTGCAGC 
TATTGTTGAACACAATACATCAGGAATCATCTCTCGTAAATCTGATAATGTAAGCAGTCCAAAAGACTT 
GGT TGGT AAG AAAT ATGGGAC ATGG AATG AC C C AACTG AAC TTGC T ATGTTG AAAAC CTTGGT AG AAT C 
TCAAGGTGGAGACTTTGAGAAGGTTGAAAAAGTACCAAATAACGACTCAAACTCAATCACACCGATTGC 
CAATGGCGTCTTTGATACTGCTTGGATTTACTACGGTTGGGATGGTATCCTTGCTAAATCTCAAGGTGT 
AG ATGC T AAC TTC ATGT ACTTG AAAG AC T ATGTC AAGG AGTTTG ACTAC T ATTC AC C AGTT ATC ATCGC 
AAAC AAC G AC T ATC TG AAAG AT AAC AAAG AAG AAGCTCGC AAAG TC ATCC AAGC C ATC AAAAAAGGCT A 
CCAATATGCCATGGAACATCCAGAAGAAGCTGCAGATATTCTCATCAAGAATGCACCTGAACTCAAGGA 
AAAACGTGACTTTGTCATCGAATCTCAAAAATACTTGTCAAAAGAATACGCAAGCGACAAGG^AAAATG 
GGGTC AATTTG ACGC AGC TCGCTGGAATGCTTTCT AC AAATGGG AT AAAG AAAATGGT ATC CTT AAAG A 
AGAC TTGACAGACAAAGGC TTC ACCAACGAATTTGTG AAA 

SP009 amino acid (SEQ ID NO: 12) 

GQGTASKDNKEAELKKVDFILDWTPNTNHTGLYVAKEKGYFKEAGVDVDLKLPPEESSSDLVINGKAPF 
AVYFQDYMAKKLEKGAGITAVAAIVEHNTSGIISRKSDtA/SSPKDLVGKKYGTWNDPTELAMLKTLVES 
QGGDFEKVEKVPNNDSNSITPIANGVFDTAWIYYGWIXSILAKSQGV^ 

NNDYLKDNKEEARKVIQAIKKGYQYAMEHPEEAADILIKNAPELKEKRDFVIESQKYLSKEYASDKEKW 
GQFDAARWNAF YKWDKENG I LXEDLTDKGFTNEFVK 

SP010 nucleotide (SEQ ID NO : 13) 

TAGCTCAGGTGGAAACGCTGGTTCATCCTCTGGAAAAACAACTGCCAAAGCTCGCACTATCGATGAAAT 
CAAAAAAAGCGGTGAACTGCGAATCGCCGTGTTTGGAGATAAAAAACCGTTTGGCTACGTTGACAATGA 
TGGTTCTACCAAGGTACGCTACGATATTGAACTAGGGAACCAACTAGCTCAAGACCTTGGTGTCAAGGT 
TAAATACATTTCAGTCGATGCTGCCAACCGTGCGGAATACTTGATTTCAAACAAGGTAGATATTACTCT 
TGCTAACTTTACAGTAACTGACGAACGTAAGAAACAAGTTCATTTTC 

TCTGGGTGTCGTATCACCTAAGACTGGTCTCATTACAGACGTCAAACAACTTGAAGGTAAAACCTTAAT 

TGTCACAAAAGGAACGACTGCTGAGACTTATTTTGAAAAGAATCATCCAGAAATCAAACTCCAAAAA 

CGACCAATACAGTGACTCTTACCAAGCTCTTCTTGACGGACGTGGAGATGCCTTTTCAACTGACAATAC 

GGAAGTTCTAGCTTGGGCGCTTGAAAATAAAGGATTTGAAGTAGGAATTACTTCCCTCGGTGATCCCGA 

TACCATTGCGGCAGCAGTTCAAAAAGGCAACCAAGAATTGCTAGACTTCATCAATAAAGATATTGAAAA 

ATTAGGCAAGGAAAACTTCTTCCACAAGGCCTATGAAAAGACACTTCACCCAACCTACGGTGACGCTGC 

TAAAGCAGATGACCTGGTTGTTGAAGGTGGAAAAGTTGAT 

SP010 amino acid (SEQ ID NO: 14) 

SSGGNAGSSSGKTTAKARTIDEIKKSGELRIA\^GDKKPFGYVDNIX3STKVRYDIEI/3NQIAQDLGVW 

KY I S VD AANRAE YL I S NKVD I T LANFTVTDERKKQ VD F A L P YMKV S LG W S P KTGL I TDVXQ L EG KT L I 
VTKGTTAETYrEKNHPEIKLQKYDQYSDSYQALLDGRGDAFSTDNTEVLAWALENKGFEVGITSLGDPD 

T I AAA VQKGNQ E L LDF I NKD I EKLGKENF F H KAY E KT LH PT YGDAAKADD L WEGG KVD 
SP011 nucleotide (SEQ ID NO:15) 

CTCCAACTATGGTAAATCTGCGGATGGCACAGTGACCATCGAGTATTTCAACCAGAAAAAAGAAATGAC 
CAAAACCTTGGAAGAAATCACTCGTGATTTTGAGAAGGAAAA 

TGT AC C AAATGCTGGTG AAGT ATTG AAG AC ACGCGTTCTCGC AGG AG ATG TGCCTG ATGTGGTC AATAT 
TTACCCACAGTCCATCGAACTGCAAGAATGGGCAAAAGCAGGTGTTTTTGAAGATTTGAGCAACAAAGA 
CTACC TG AAAC GC GTG AAAAATGGCT AC GCTG AAAAAT ATGCTGT AAAC G AAAAAGTTT AC AACGTTCC 
TTTT AC AGCT AATGC TT ATGG AATTT ACT AC AAC AAAG AT AAATTCG AAG AAC TGGGCTTG AAGGTTC C 
TGAAACCTGGGATGAATTTGAACAGTTAGTCAAAGATATCGTTGCTAAAGGACAAACACCATTTGGAAT 
TGCAGGTGCAGATGCTTGGACACTCAATGGTTACAATCAATTAGCCTTTGCGACAGCAACAGGTGGAGG 
AAAAGAAGCAAATCAATACCTTCGTTATTCTCAACCAAATGCCATTAAATTGTCGGATCCGATTATGAA 
AG ATG AT ATC AAGGTC ATGG AC ATC C TTC G C ATC AATGG ATCT AAGC AAAAG AACTGGG AAGGTGC TGG 
CTATACCGATGTTATCGGAGCCTTCGCACGTGGGGATGTCCTCATGACACCAAATGGGTCTTGGGCGAT 
C AC AGC G ATT AATG AAC AAAAAC CGAACTTT AAG ATTGGGACCTTC ATG ATTC C AGG AAAAG AAAAAGG 
ACAAAGCTTAACCGTTGGTGCGGGAGACTTGGCATGGTCTATCTCAGCCACCACCAAACATCCAAAAGA 
AGC C AATGC CTTTGTGG AATAT ATG AC C CGTCC AG AAGTCATGC AAAAAT ACT ACGATGTGG AC GG ATC 
TC C AAC AGC G ATC GAAGGGGTC AAAC AAGC AGG AG AAG ATTC ACC GCTTGCTGGT ATG AC C G AAT ATGC 
CTTT ACGG ATCGTC ACTTGGTCTGGTTGCAAC AAT ACTGG AC C AGTG AAGC AG AC TTCC AT ACC TTG AC 
CATGAACTATCTCTTGACCGG^ 
GAAAGCGGATGTGGAT 

SP011 amino acid (SEQ ID NO: 16) 

SNYGKSADGTVTIEYFNQKKEMTKTLEEITRDFEKE^ 
YPQSIELQEWAKAGVFEDLSNKDYLKRVKNGYAEX 

ETWDEFEQLVKDIVAKGQTPFGIAGADAWTLNGYNQLAFATATGGGKEANQYLRYSQPNAIKLSDPIMK 
DDIKVKDILRINGSKQKNWEGAGYTDVIGAFARGDVLMTPNGSWAITAINEQKPNFKIGTFMIPGKEKG 
OSL TVGAGDLAWSI SATTKHPKEANAFVEYMTRPEVMQKYYDVDGS PTAIEGVKQAGEDS PLAGMTEYA 
FTDRHLVWLCQYVJTSEADFHTLTMNYVLTGDKQGMVNDLNAFFNPMKADVD 

SP012 nucleotide (SEQ ID NO: 17) 

TGGGAAAAATTCTAGCGAAACTAGTGGAGATAATTGGTCAAAGTACCAGTCTAACAAGTCTATTACTAT 
TGGATTTGATAGTACTTTTGTTCCAATGGGATTTGCTCAGAAAGATGGTTCTTATGCAGGATTTGATAT 
TG ATTT AGCT AC AGC TGTTTTTGAAAAAT ACGG AATCACGGTAAATTGGC AAC CG ATTG ATTGGG ATTT 
GAAAGAAGCTGAATTGACAAAAGGAACGATTGATCTGATTTGGAATGGCTATTCCGCTACAGACGAACG 

CCGTGAAAAGGTGGCTTTCAGTAACTCATATATGAAGAATGAGCAGGTATTGGTTACGAAGAAATCATC 

TGGTATCACGACTGCAAAGGATATGACTGGAAAGACATTAGGAGCTCAAGCTGGTTCATCTGGTTATGC 

GGACTTTGAAGCAAATCCAGAAATTTTGAAGAATATTGTCGCTAATAAGGAAGCGAATCAATACCAAAC 

CTTTAATGAAGCCTTGATTGATTTGAAAAACGATCGAATTGATGGTCTATTGATTGACC^ 

AAACTATTATTTAGAAGCAGAAGGTGTTTTAAACGATTATAATGTCTTTACAGTTGGACTAGAAACAGA 

AGCTTTTGCGGTTGGAGCCCGTAAGGAAGATACAAACTTGGTTAAGAAGATAAATGAAGCTTTTTCTAG 

TCTTTACAAGGACGGCAAGTTCCAAGAAATCAGCCAAAAATGGTTTGGAGAAGATGTAGCAACCAAAGA 

AGTAAAAGAAGGACAG 

SP012 amino acid (SEQ ID NO: 18) 

GKNSSETSGDNWSKYQSNKSITIGFDSTFVPMGFAQKDGSYAGFDIDLATAVTEKYGITVNWQPIDWDL 
KEAELTKGTIDLIWNGYSATDERREKVAFSNSYMKNEQVLVTKKSSGITTAKDMTGKTLGAQAGSSGYA 
DFEANPEI LKNI VANKEANQ YQTFNEAL I DLKNDR IDGLLI DRVYANYYLEAEGVLND YNVFTVGLETE 
AFAVGARKEDTNLVKKINEAFSSLYKDGKFQEISQKWFGEDVATKEVKEGQ 

SP013 nucleotide (SEQ ID NO: 19) 

TGCTAGCGGAAAAAAAGATACAACTTCTGGTCAAAAACTAAAAGTTGTTGCTACAAACTCAATCATCGC 
TGATATTACTAAAAATATTGCTGGTGACAAAATTGACCTTCATAGTATCGTTCCGATTGGGCAAGACCC 
ACACGAATACGAACCACTTCCTGAAGACGTT.^AGAAAACTTCTGAGGCTAA.TTTGATTTTCTATAACGG 
TATCAACCTTGAAACAGGTGGCAATGCTTGG'TTTACAAAATTGGTAGAAAATGCCAAGAAAACTGAAAA 
CAAAGACTACTTCGCAGTCAGCGACGGCGTTGATGTTATCTACCTTGAAGGTCAAAATGAAAAAGGAAA 
AGAAGACCCACACGCTTGGCTTAACCTTGAAAACGGTATTATTTTTGCTAAAAATATCGCCAAACAATT 
GAGCGCCAAAGACCCTAACAATAAAGAATTCTATGAAAAAAATCTCAAAGAATATACTGATAAGTTAGA 
C AAAC TTG AT AAAG AAAGT AAGG AT AAATTT AAT AAGATC CCTG CTG AAAAG AAAC T CATTGT AAC C AG 
CG AAGG AGC ATTC AAAT AC TTC TC T AAAG CC T ATGGTGTC C C AAG TGC TT AC ATCTGGG AAATC AAT AC 
TGAAGAAG AAGG AACTCCTG AAC AAATC AAG ACCTTGGTTG AAAAACTTC GC C AAAC AAAAGTTC C ATC 
ACTCTTTGTAGAATCAAGTGTGGATGACCGTCCAATGAAAACTGTTTCTCAAGACACAAACATCCCAAT 
CTACGCTCAAATCTTTACTGACTCTATCGCAGAACAAGGTAAAGAAGGCGACAGCTACTACAGCATGAT 
GAAATACAACCTTGACAAGATTGCTGAAGGATTGGCAAAA 

SP013 amino acid (SEQ ID NO: 20) 

ASGKKDTTSGQKLKWATNSIIADITKNIAGDKIDLKSIVPIGQDPHEYEPLPEDVKKTSEANLIFYNG 
INLETGGNAWFTKLVENAKKTENKDYFAVSDGVDVIYLEGQNEKGKEDPHAWLNLENGIIr AKNIAKQL 
SAKDPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINT 
EEEGTPEQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAEQGKEGDSYYSMM 

XYNLDKIAEGLAK 

SP014 nucleotide (SEQ ID NO: 21) 

TGGCTCAAAAAATACAGCTTCAAGTCCAGATTATAAGTTGGAAGGTGTAACATTCCCGCTTCAAGAAAA 
G AAAAC ATTG AAGTTTATG AC AGC C AGTTC AC C GTT ATC TCCT AAAG AC C C AAATG AAAAGTT AATTTT 
GCAACGTTTGGAGAAGGAAACTGGCGTTCATATTGACTGGACCAACTACCAATCCGACTTTGCAGAAAA 
AC GT AACTTGGAT ATTTCT AGTGG TG ATTT AC C AG ATGCT ATC C AC AAC G AC GG AGCTTC AG ATGTGG A 
C TTG ATGAACTGGGCTAAAAAAGGTGTT ATT ATTC CAGTTG AAG ATTTG ATTG AT AAAT AC ATGCC AAA 
TCTT AAG AAAATTTTGG ATG AG AAACC AG AG T AC AAGGCC TTG ATG AC AGC AC CTG ATGGGC AC ATTT A 
CTCATTTCCATGGATTGAAGAGCTTGGAGATGGTAAAGAGTCTATTCACAGTGTCAACGATATGGCTTG 
GATTAACAAAGATTGGCTTAAGAAACTTGGTCTTGAAATGCCAAAAACTACTGATGATTTGATT AAAGT 
CCTAGAAGCTTTCAAAAACGGGGATCCAAATGGAAATGGAGAGGCTGATGAAATTCCATTTTCATTTAT 
T AGTGG TAACGGAAACG AAG ATTTTAAATTC C T ATTTGCTGC ATTTGGT AT AGGGG AT AAC G ATG ATC A 
TTTAGTAGTAGGAAATGATGGCAAAGTTGACTTCACAGCAGATAACGATAACTATAAAGAAGGTGTCAA 
ATTT ATC C GT C AATTGC AAG AAAAAGGC CTG ATTG AT AAAG AAG C TTTCG AAC ATG ATTGG AAT AGTT A 
C ATTGCT AAAGGTC ATG ATC AG AAATTTGGTGTTT ACTTT AC ATGGG AT AAG AAT AATGTT AC TGG AAG 
TAACGAAAGTTATGATGTTTTACCAGTACTTGCTGGACCAAGTGGTCAAAAACACGTAGCTCGTACAAA 
CGGT ATGGG ATTTGC AC GTG AC AAG ATGGTT ATT AC C AGTG T AAAC AAAAAC CT AG AATTG AC AGCT AA 
ATGGATTG ATGC AC AAT AC GC TC C AC TC C AATCTGTGC AAAAT AAC TGGGG AAC TT ACGG AG ATG AC AA 
ACAACAAAACATCTTTGAATTGGATCAAGCGTCAAATAGTCTAAAACACTTACCACTAAACGGAACTGC 
AC C AGC AG AAC TT CGTC AAAAG AC TG AAGT AGG AGG AC C ACT AGCT ATCCT AG ATTC AT ACT ATGGT AA 
AGTAACAACCATGCCTGATGATGCCAAATGGCGTTTGGATCTTATCAAAGAATATTATGTTCCTTACAT 
GAGCAATGTCAATAACTATCCAAGAGTCTTTATGACACAGGAAGATTTGGACAAGATTGCCCATATCGA 
AGCAGATATGAATGACTATATCTACCGTAAACGTGCTGAATGGATTGTAAATGGCAATATTGATACTGA 
GTGGGATGATTACAAGAAAGAACTTGAAAAATACGGACTTTCTGATTACCTCGCTATTAAACAAAAATA 
CTACGACCAATACCAAGCAAACAAAAAC 

SP014 amino acid (SEQ ID NO:22) 

GSKNTASSPDYXLEGVTFPLQEKKTLKFT1TASSPLSPKDPNEKLILQRLEKETGVHIDWTNYQSDFAEK 
RNLDISSGDLPDAIHNDGASDVT)LMNWAKKGVIIPVEDLIDKYM^ 

SFPWIEELGDGKESIHSVNDMAWINKDWLKKLGLEMPKTTDDLIKVLEAFKNGDPNGNGEADEIPFSFI 

SGNGNEDFKFLFAAFGIGDNDDHLWGNDGKTOFTADNDNYKEGVTCFIRQLQEKGLIDKEAFEHDWNSY 

IAKGHDQKFGVYFTWDKNNVTGSNESYDVLPVIiAGPSGQKHVARTNGMGFARDKMV 

WIDAQYAPLQSVON^GTYGDDKQQNIFELDOASNSLKHLPLNGTAPAELRQKTEVGGPl*AII*DSYYGK 

VTTMPDDAKVTCLDLIKEYWPYMSNVNNYPRVFMT^ 

WDDYKKELEKYGLSDYLAIKQKYYDQYQANKN 

SP015 nucleotide (SEQ ID NO: 23) 

T AGT AC AAACTC AAGC AC T AGTCAGAC AG AG ACC AGT AGCTCTGC TCC AAC AG AGGT AAC C ATT AAAAG 
TTCACTGGACGAGGTCAAACTTTCCAAAGTTCCTGAAAAGATTGTGACCTTTGACCTCGGCGCTGCGGA 
TACTATTCGCGCTTTAGGATTTGAAAAAAATATCGTCGGAATGCCTACAAAAACTGTTCCGACTTATCT 
AAAAGACCTAGTGGGAACTGTCAAAAATGTTGGTTCTATGAAAGAACCTGATTTAGAAGCTATCGCCGC 
CCTTGAGCCTGATTTGATTATCGCTTCGCCACGTACACAAAAATTCGTAGACAAATTCAAAGAAATCGC 
CCCAACCGTTCTCTTCCAAGCAAGCAAGGACGACTACTGGACTTCTACCAAGGCTAATATCGAATCCTT 
AGCAAGTGCCTTCGGCGAAACTGGTACACAGAAAGCCAAGGAAGAATTGACCAAGCTAGACAAGAGCAT 
CCAAGAAGTCGCTACTAAAAATGAAAGCTCTGACAAAAAAGCCCTTGCGATCCTCCTTAATGAAGGAAA 
AATGGCAGCCTTTGGJGCCAAATCTCGTTTCTCTTTCTTGTACCAAACCTTGAAATTCAAACCAACTGA 
TACAAAATTTGAAGACTCACGCCACGGACAAGAAGTCAGCTTTGAAAGTGTCAAAGAAATCAACCCTGA 
CATCCTCTTTGTCATCAACCGTACCCTTGCCATCGGTGGGGACAACTCTAGCAACGACGGTGTCCTAGA 
AAATGCCCTTATCGCTGAAACACCTGCTGCTAAAAATGGTAAGATTATCCAACTAACACCAGACCTCTG 
GTATCTAAGCGGAGGCGGACTTGAATCAACAAAACTCATGATTGAAGACATACAAAAAGCTTTGAAA 

SP015 amino acid (SEQ ID NO: 24) 

STNSSTSQTETSSSAPTEVTIKSSLDEVKLSKVPEKIVTFDLGAADTIRALGFEKNIVGMPTKTVPTYL 
KDLVGTVKNVGSMKEPDLEAIAALEPDLIIASPRTQKFVDKFKEIAPTVLFQASKDDYWTSTKANIESL 
ASAFGETGTQKAKEELTKLDKSIQEVATKNESSDKKALAILLNEGKMAAFGAKSRFSFLYQTLKFKPTD 
TKFEDSRHGQEVSFESVKEINPDILFVINRTLAIGGDNSSNDGVLENALIAETPAAKNGXIIQLTPDLW 
YLSGGGLESTKLMIEDIQXALK 

SP016 nucleotide (SEQ ID NO:25) 

TGGCAATTCTGGCGGAAGTAAAGATGCTGCCAAATCAGGTGGTGACGGTGCCAAAACAGAAATCACTTG 

G TGGGC ATTCC C AGT ATTT AC C C AAG AAAAAACTGGTG ACGGTGTTGG AACTT ATG AAAAATC AATC AT 

CGAAGCGTTTGAAAAAGCAAACCCAGATATAAAAGTGAAATTGGAAACCATCGACTTCAAGTCAGGTCC 

TGAAAAAATCACAACAGCCATCGAAGCAGGAACAGCTCCAGACGTACTCTTTGATGCACCAGGACGTAT 

CATCCAATACGGTAAAAACGGTAAATTGGCTGAGTTGAATGACCTCTTCACAGATGAATTTGT^ 

TGTCAACAATGAAAACATCGTACAAGCAAGTAAAGCTGGAGACAAGGCTTATATGTATCCGATTAGTTC 

TGCCCCATTCTACATGGCAATGAACAAGAAAATGTTAGAAGATGCTGGAGTAGCAAACCTTGTAAAAGA 

AGGTTGGACAACTGATGATTTTGAAAAAGTATTGAAAGCACTTAAAGACAAGGGTTACACACCAGGTTC 

ATTGTTCAGTTCTGGTCAAGGGGGAGACCAAGGAACACGTGCCTTTATCTCTAACCTTTATAGCGGTTC 

TGTAACAGATGAAAAAGTTAGCAAATATACAACTGATGATCCTAAATTCGTCAAAGGTCTTGAAAAAGC 

AACTAGCTGGATTAAAGACAATTTGATCAATAATGGTTCACAATTTGACGGTGGGGCAGATATCCAAAA 

CTTTGCCAACGGTCAAACATCTTACACAATCCTTTGGGCACCAGCTCAAAATGGTATCCAAGCTAAACT 

TTTAGAAGCAAGTAAGGTAGAAGTGGTAGAAGTACCATTCCCATCAGACGAAGGTAAGCCAGCTCTTGA 

GTACCTTGTAAACGGGTTTGCAGTATTCAACAATAAAGACGACAAGAAAGTCGCTGCATCTAAGAAATT 

CATCCAGTTTATCGCAGATGACAAGGAGTGGGGACCTAAAGACGTAGTTCGTACAGGTGCTTTCCCAGT 

CCGTACTTCATTTGGAAAACTTTATGAAGACAAACGCATGGAAACAATCAGCGGCTGGACTCAATACTA 

CTC AC CAT ACT AC AAC ACT ATTG ATGG ATTTGCTG AAATG AG AAC ACTTTGGTTC CC AATGTTGC AATC 

TGTATCAAATGGTGACGAAAAACCAGCAGATGCTTTGAAAGCCTTCACTGAAAAAGCGAACGAAACAAT 

CAAAAAAGCTATGAAACAA 
SP016 amino acid (SEQ ID NO: 26) 

GNSGGSKDAAKSGGI>SAKTEITWWAFPVFTQEKTGDGVGTYEKSIIEAFEKANPDIKVKLETIDF 

EKI TT A I EAGT AP DVL FDA PGR 1 1 Q YG KNG K LAE LND L FTD EFVKDVNN EN I VQ A S KAGDKA YMY PISS 

APFYMAMNKKMLEDAGVANLVKEGWTTDDFEKVLKALKDKGYTPGSLFSSGQGGDQGTRAFISNLYSGS 

VTDEKVSKYTTDDPKFVKGLEKATSWIKDNLINNGSQFDGGADIQNFANGQTSYTILWAPAQNGIQAKL 

LEASKVEVVEVPFPSDEGKPALEYLVNGFAVFNNKDDKKVAASKKFIQFIADDKEWGPKDVVRT 

RTSFGKLYEDKHMETISGWTQYYSPYYNTIDGFAEMRTLV^PMLQSVSNGDEKPADALKAFTEKANETI 

KKAMKQ 

SP017 nucleotide (SEQ ID NO:27) 

TTCACAAGAAAAAACAAAkAATGAAGATGGAGAAACTAAGACAGAACAGACAGCCAAAGCTGATGGAAC 
AGTCGGTAGTAAGTCTCAAGGAGCTGCCCAGAAGAAAGCAGAAGTGGTCAATAAAGGTGATTACTACAG 
CATTCAAGGGAAATACGATGAAATCATCGTAGCCAACAAACACTATCCATTGTCTAAAGACTATAATCC 
AGGGGAAAATCCAACAGCCAAGGCAGAGTTGGTCAAACTCATCAAAGCGATGCAAGAGGCAGGTTTCCC 
TATTAGTGATCATTACAGTGGTTTTAGAAGTTATGAAACTCAGACCAAGCTCTATCAAGATTATGTCAA 
C C AAG ATGG AAAGGC AGCAGCTG ACCG TT AC TCTGC CCGTC CTGGCT AT AGCGAAC AC C AG AC AGGCTT 
GGC CTTTGATGTG ATTGGG ACTG ATGGTG ATTTGGTG AC AG AAG AAAAAGC AGC CC AATGG C TC TTGG A 
TC ATGC AGC TG ATT ATGGCTTTG TTGTCCG TTATCTC AAAGGC AAGGAAAAGGAAAC AGGCT AT ATGGC 
TGAAGAATGGCACCTGCGTTATGTAGGAAAAGAAGCTAAAGAAATTGCTGCAAGTGGTCTCAGTTTGGA 
AGAATACTATGGCTTTGAAGGCGGAGACTACGTCGAT 

SP017 amino acid (SEQ ID NO:28) 

SOEKTraJEDGETKTEQTAKAIXSTVGSKSQGAAQKKAEVVNKGDYYSIQGKYDEIIVANKhT 
GENPTAKAELVKLIKAMQEAGFPISDHYSGFRSYETQTKLYQDYVNQDGKAAADRYSARPGYSEHQTGL 
AFDVIGTDGDLVTEE / KAAQWLLDHAADYGFWRYLKGKEKETGYMAEEWHLRYVGKEAKEIAASGLSLE 
EYYGFEGGDYVD 

SP019 nucleotide (SEQ ID NO:29) 

GAAAGGTCTGTGGTCAAATAATCTTACCTGCGGTTATGATGAAAAAATAATCTTGGAAAATATAAATAT 
AAAAATACCTGAAGAAAAAATATCAGTTATTATTGGGTCAAATGGTTGTGGGAAATCAACACTCATTAA 
AAC CTTGTC TCG AC TT AT AAAGC C ATT AG AGGG AG AAGT ATTGCTTG AT AAT AAATC AATT AATTCTT A 
TAAAGAAAAAGATTTAGCAAAACACATAGCTATATTACCTCAATCTCCAATAATCCCTGAATCAATAAC 
AGT AGCTG ATCTTGT AAGC CGTGGTC GTTTC C C CT AC AG AAAGC CTTTT AAG AGTCTTGG AAAAG ATG A 
CCTTGAAATAATAAACAGATCAATGGTTAAGGCCAATGTTGAAGATCTAGCAAATAACCTAGTTGAAGA 
ACTTTCTGGGGGTC AAAGGC AAAG AGT ATGG AT AGCTCTAGCCC T AGC C C AAG ATAC AAG T ATC CT AC T 
TTT AG ATG AGC C AACT AC TT AC TTGG AT ATCTC AT ATC AAAT AG AACT ATT AG ACCTC TTG AC TG ATCT 
AAACCAAAAATATAAGACAACCATTTGCATGATTTTGCACGATATAAATCTAACAGCAAGATACGCTGA 
TTACCTATTTGCAATTAAAGAAGGTAAACTTGTTGCAGAGGGAAAGCCTGAAGATATACTAAATGATAA 
ACTAGTTAAAGATATCTTTAATCTTGAAGCAAAAATTATACGTGACCCTATTTCCAATTCGCCTCTAAT 
GATTCCTATTGGCAAGCACCATGTTAACTCT 

SP019 amino acid (SEQ ID NO: 30) 

KGLWSNNLTCGYDEKIILENINIKIPEEKISVIIGSNGCGKSTLIKTLSRLIKPLEGEVLLDNKSINSY 
KEKDIAKHIAILPQSPIIPESIWADLVSRGRFPYRKPFKSLGKDDLEIINRSMVTCANVEDIJ^ 
LSGGQRQRWIALAXAQDTSILLLDEPTTYLDISYQIELLDLLTDLNQKYKTTICMILHDINLTARYAD 
YLFAIKEGKLVAEGKPEDILNDKLVKDIFNLEAKIIRDPISNSPLMIPIGKHHVS 

SP020 nucleotide (SEQ ID NO:31) 

AAACTC AG AAAAG AAAGC AG AC AATGC AAC AACT ATC AAAATC GC AACTGTT AAC CGT AGC GG TTCTG A 
AGAAAAACGTTGGGACAAAATCCAAGAATTGGTTAAAAAAGACGGAATTACCTTGGAATTTACAGAGTT 
CACAGACTACTCACAACCAAACAAAGCAACTGCTGATGGCGAAGTAGATTTGAACGCTTTCCAACACTA 
TAACTTCTTGAACAACTGGAACAAAGAAAACGGAAAAGACCTTGTAGCGATTGCAGATACTTACATCTC 
TCCAATCCGCCTTTACTCAGGTTTGAATGGAAGTGCCAACAAGTACACTAAAGTAGAAGACATCCCAGC 
AAACGG AGAAATCGCTGT ACCG AATG AC GC T AC AAACGAAAGC CGTGCGCTTT ATTTGC TTC AATC AGC 
TGGCTTGATTAAATTGGATCTTTCTGGAACTGCTCTTGCAACAGTTGCCAACATCAAAGAAAATCCA^ 
GAACTTG AAAATC AC TGAATTGGACG CT AGC CAAACAGCTCGTTCATTGTCATCAGTTGACGCTGC CGT 
TGT AAAC AAT AC C TTCGT T AC AG AAGC AAAATTGG AC T AC AAG AAATC ACTTTTC AAAG AAC AAGCTG A 
TGAAAACTCAAAACAATGGTACAACATCATTGTTGCAAAAAAAGATTGGGAAACATCACCTAAGGCTGA 
TGCTATCAAGAAAGTAATCGCAGCTTACCACACAGATGACGTGAAAAAAGTTATCGAAGAATCATCAGA 
TGGTTTGGATCAACCAGTTTGG 

SP020 amino acid (SEQ ID NO:32) 

NSEKKADNATTIKIATVNRSGSEEKRWDKIQELVKKDGITLEFTEFTDYSQPNKATAIX3EVDLNAFQHY 
NFLNNWNKENGKDLVAIADTYISPIRLYSGLNGSANK^ 

GLIKLDVSGTAIJVWANIKENPKNLKITEU5ASQTARSLSSVDAAVVNNTFVTEAKLD 
EN S KQ WYN 1 I VAKKDWET S P KADA I KKVT AA YHTDDVKKV IEESSDGLDQ P VW 

SP021 nucleotide (SEQ ID NO:33) 

TTCGAAAGGGTCAGAAGGTGCAGACCTTATCAGCATGAAAGGGGATGTCATTACAGAACATCAATTTTA 
TGAGCAAGTGAAAAGCAACCCTTCAGCCCAACAAGTCTTGTTAAATATGACCATCCAAAAAGTTTTTGA 
AAAACAATATGGCTCAGAGCTTGATGATAAAGAGGTTGATGATACTATTGCCGAAGAAAAAAAACAATA 
TGGCGAAAACTACCAACGTGTCTTGTCACAAGCAGGTATGACTCTTGAAACACGTAAAGCTCAAATTCG 
TACAAGTAAATTAGrTGAGTTGGCAGTTAAGAAGGTAGCAGAAGCTGAATTGACAGATGAAGCCTATAA 
GAAAGCCTTTGATGAGTACACTCCAGATGTAACGGCTCAAATCATCCGTCTTAATAATGAAGATAAGGC 
CAAAGAAGTTCTCGAAAAAGCCAAGGCAGAAGGTGCTGATTTTGCTCAATTAGCCAAAGATAATTCAAC 
TGATG AAAAAAC AAAAG AAAATGGTGG AG AAATT ACC TTTG ATTCTGCTTC AAC AG AAGT AC C TGG AGC 
AAGTCCAAAAAAGCCGCTTTTCGCTTTTAGATGTGGGATGGTGTTTCTGGATGTGGATTACAGCAACTG 

GGGCACACCAAGCCTACAG 

SP021 amino acid (SEQ ID NO:34) 

SKGSEGADLISMKGDVITEHQFYEQVKSNPSAQQVLLNMTIQKVFEKQYGSELDDKEVDDTIAEEKKQY 
G ENYQ R VLS Q AGMT L ETRKAQ I RT S KL VE LA VK KV AEAE LTD EA YKKAF D E YT P D VT AQ 1 1 R LNNE D KA 
KEVLEKAKAEGADFAQLAKDNSTDEKTKENGGEITFDSASTEVPGASPKKPLFAFRCGMVFLDVDYSNW 

GTPSLQ 

SP022 nucleotide (SEQ ID NO:35) 

GGGGATGGCAGCTTTTAAAAATCCTAACAATCAATACAAAGCTATTACAATTGCTCAAACTCTAGGTGA 
TG ATGCTTC TTC AG AGG AATTGGCTGGT AG AT ATGGTTCTGCTGTTC AG TG T AC AG AAGTG AC TG C C TC 
AAACCTTTCAACAGTTAAAACTAAAGCTACGGTTCTAGAAAAACCACTGAAAGATTTTAGAGCGTCTAC 
GTCTGATCAGTCTGGTTGGGTGGAATCTAATGGTAAATGGTATTTCTATGAGTCTGGTGATGTGAAGAC 
AGGTTGGGTGAAAACAGATGGTAAATGGTACTATTTGAATGACTTAGGTGTCATGCAGACTGGATTTGT 
AAAATTTT C TGGT AGCTGGTATT AC TTG AGC AATTC AGGTGC T ATG TTT AC AGGCTGGGG AAC AG ATGG 
TAGCAGATGGTTCTACTTTGACGGCTCAGGAGCTATGAAGACAGGCTGGTACAAGGAAAATGGCACTTG 
GTATTACCTTGACGAAGCAGGTATCATGAAGACAGGTTGGTTTAAAGTCGGACCACACTGGTACTATGC 
CTACGGTTCAGGAGCTTTGGCTGTGAGCACAACAACACCAGATGGTTACCGTGTAAATGGTAATGGTGA 

ATGGGTAAAC 

SP022 amino acid (SEQ ID NO:36) 

GMAAFKNPNNQYKAITIAQTLGDDASSEELAGRYGSAVQCTEVTASNLSTVKTKATVVEKPLKDFRAST 
S DQ SGWVE SNGKWYF YE SGDVKTGWVKTDG KWYY LND LGVMQTG FVKF SG S WY Y L S N S G AMF TG WGTDG 
SRWFYFDGSGAMKTGOTKENGTWYYLDEAGIMKTGWFKVGPHVTTO 
WVN 

SP023 nucleotide (SEQ ID NO:37) 

AGACGAGCAAAAAATTAAGCAAGCAGAAGCGGAAGTTGAGAGTAAACAAGCTGAGGCTACAAGGTTAAA 
AAAAATCAAGACAGATCGTGAAGAAGCAGAAGAAGAAGCTAAACGAAGAGCAGATGCTAAAGAGCAAGG 
TAAACCAAAGGGGCGGGCAAAACGAGGAGTTCCTGGAGAGCTAGCAACACCTGATAAAAAAGAAAATGA 
TGCG AAGTCTTC AG ATTCT AGCGT AGGTG AAGAAAC TCTTC C AAGC CC ATC C C TG AAACC AG AAAAAAA 
GGTAGCAGAAGCTGAGAAGAAGGTTGAAGAAGCTAAGAAAAAAGCCGAGGATCAAAAAGAAGAAGATCG 
CCGTAACTACCCAACCAATACTTACAAAACGCTTGAACTTGAAATTGCTGAGTCCGATGTGGAAGTTAA 
AAAAGCGGAGCTTGAACTAGTAAAAGAGGAAGCTAAGGAACCTCGAAACGAGGAAAAAGTTAAGCAAGC 
AAAAGCGGAAGTTGAGAGTAAAAAAGCTGAGGCTACAAGGTTAGAAAAAATCAAGACAGATCGTAAAAA 
AGCAGAAGAAGAAGCTAAACGAAAAGCAGCAGAAGAAGATAAAGTTAAAGAAAAACCAGCTGAACAACC 
ACAACCAGCGCCGGCTCCAAAAGCAGAAAAACCAGCTCCAGCTCCAAAACCAGAGAATCCAGCTGAACA 
ACCAAAAGCAGAAAAACCAGCTGATCAACAAGCTGAAGAAGACTATGCTCGTAGATCAGAAGAAGAATA 
TAATCGCTTGACTCAACAGCAACCGCCAAAAACTGAAAAACCAGCACAACCATCTACTCCAAAAACAGG 
CTGGAAACAAGAAAACGGTATGTGGTACTTCTACAATACTGATGGTTCAATGGCGACAGGATGGCTCCA 
AAACAATGGCTCATGGTACTACCTCAACAGCAATGGCGCTATGGCGACAGGATGGCTCCAAAACAATGG 
TTC ATGGT AC T ATCTAAACGC TAATGG TTC AATGGC AAC AGG ATGGC T CC AAAAC AATGGTTC ATGGT A 
CTACCTAAACGCTAATGGTTCAATGGCGACAGGATGGCTCCAATACAATGGCTCATGGTACTACCTAAA 
CGCTAATGGTTCAATGGCGACAGGATGGCTCCAATACAATGGCTCATGGTACTACCTAAACGCTAATGG 
TGATATGGCGACAGGTTGGGTGAAAGATGGAGATACCTGGTACTATCTTGAAGCATCAGGTGCTATGAA 
AGCAAGCCAATGG7TCAAAGTATCAGATAAATGGTACTATGTCAATGGCTCAGGTGCCCTTGCAGTCAA 
CACAACTGTAGATGGCTATGGAGTCAATGCCAATGGTGAATGGGTAAAC 

SP023 amino acid (SEQ ID NO: 38) 

DEQKIKQAEAETV^SKQAEATRLKKIKTDREEAEEEAKRRADAKEQGKPKGRAKRGVPGELATPDKKEND 

AKSSDSSVGEETLPSPSLKPEKKVAEAEKKVEEAKKKAEDQKEEDRRNYPTNTYKTLELEIAESDVEVK 

KAELELVKEEAKEPRNEEKVXQAKAEVESKKAEATRLEKIKTDRKXAEEE1AKRKAAEEDKVXEKPAEQ 

QPAPAPKAEKPAPAPKPENPAEQPKAEKPADQQAEEDYARRSEEEYNRLTQQQPPKTEKPAQPSTPKTG 

WKQENGMWYFYNTI>3SMATGWLQNNGSOT 

YUJANGSMATGWLQYNGSWYYUVANGSMATGWLQ™^ 

AS QWF KVS D KWVTWGSG ALA VNTTVDG YGVN ANG EWVN 

SP025 nucleotide (SEQ ID NO:39) 

CTGTGGTGAGGAAGAAACTAAAAAGACTCAAGCAGCACAACAGCCAAAACAACAAACGACTGTACAACA 
AATTGCTGTTGGAAAAGATGCTCCAGACTTCACATTGCAATCCATGGATGGCAAAGAAGTTAAGTTATC 
TGATTTTAAGGGTAAAAAGGTTTACTTGAAGTTTTGGGCTTCATGGTGTGGTCCATGCAAGAAAAGTAT 
GCCAGAGTTGATGGAACTAGCGGCGAAACCAGATCGTGATTTCGAAATTCTTACTGTCATTGCACCAGG 
AATTCAAGGTGAAAAAACTGTTGAGCAATTCCCACAATGGTTCCAGGAACAAGGATATAAGGATATCCC 
AGTTCTTTATGATACCAAAGC.^ACCACTTCCAAGCTTATCAAATTCGAAGCATTCCTACAGAATATT 

SP025 amino acid (SEQ ID NO:40) 

CGEEETKKTQAAQQPKQQTWQQIAVGKDAPDFTLQSMDGKEVKLSDFKGKKVYLKFWASWCGPCKKSM 
PELMEI^KPDRDFEILTVIAPGIQGEKTVEQFPQWFQEQGYKDIPVLYDTKATTSKLIKFEAFLQNI 

SP028 nucleotide (SEQ ID NO:41) 

GACTTTTAACAATAAAACTATTGAAGAGTTGCACAATCTCCTTGTCTCTAAGGAAATTTCTGCAACAGA 
ATTGACCCAAGCAACACTTGAAAATATCAAGTCTCGTGAGGAAGCCCTCAATTCATTTGTCACCATCGC 
TGAGGAGCAAGCTCTTGTTCAAGCTAAAGCCATTGATGAAGCtGGAATTGATGCTGACAATGTCCTTTC 
AGGAATTCCACTTGCTGTTAAGGATAACATCTCTACAGACGGTATTCTCACAACTGCTGCCTCAAAAAT 
GCTCTACAACTATGAGCCAATCTTTGATGCGACagCTgTTGCCAATGCAAAAACCAAGGGCATGATTGT 
C GTTGG AAAG AC C AAC ATGG AC G AATTTGC T ATGGGTGGTTC AGG t G AAAC TTC AC ACT ACGG AG C AAC 
TAAAAACGCTTGGAACCACAGCAAGGTTCCTGGTGGGTCATCAAGTGGTTCTGCCGCAGCTGTAGCCTC 
AGGACAAGTTCGCTTGTCACTTGGTTCTGATACTGGTGGTTCCATCCGCCAACCTGCTGCCTTCAACGG 
AAT CGTTGGTCTC AAAC CAACCT ACGG AAC AGTTTCACGTTTCGGTCTCATTGCCTTTGGTAGCTC ATT 
AG ACC AG ATTGG AC CTTTTGCTCC T ACTGTT AAGG AAAATGC C CTCTTGC TC AAC GCT ATTGC C AGC G A 
AGATGCTAAAGACTCTACTTCTGCTCCTGTCCGCATCGCCGACTTTACTTCAAAAATCGGCCAAGACAT 
C AAGGGT ATG AAAATCGC TTTGCCT AAGG AAT AC C TAGGCG AAGG AATTGATC C AG AGGTT AAGG AAAC 
AATCTTAAACGCGGCCAAACACTTTGAAAAATTGGGTGCTATCGTCGAAGAAGTCAGCCTTCCTCACTC 
TAAATACGGTGTTGCCGTTTATTACATCATCGCTTCATCAGAAGCTTCATCAAACTTGCAACGCTTCGA 
CGG T ATC CGTT AC GGCT ATCGC GC AG AAG ATGC AAC C AAC CTTG ATG AAATC TATGT AAAC AGC CG AAG 
CCAAGGTTTTGGTGAAGAGGTAAAACGTCGTATCATGCTGGGTACTTTCAGTCTTTCATCAGGTTACTA 
TGATGCCTACTACAAAAAGGCTGGTCAAGTCCGTACCCTCATCATTCAAGATTTCGAAAAAGTCTTCGC 
GGATTACGATTTGATTTTGGGTCCAACTGCTCCAAGTGTTGCCTATGACTTGGATTCTCTCAACCATGA 
CCCAGTTGCCATGTACTTAGCCGACCTATTGACCATACCTGTAAACTTGGCAGGACTGCCTGGAATTTC 
G ATTC C TGCTGG ATTCTCTC AAGGTCT AC CTGTCGG AC TC CAATTG ATTGG TC C C AAGT ACTCTGAGGA 
AACC ATTT AC C AAGCTGCTGCTGC TTTTG AAGC AAC AAC AG ACT AC C AC AAAC AAC AACC CGTG ATTTT 
TGG AGG TG AC AAC 

SP028 amino acid (SEQ ID NO: 42) 

TFN^TIEELHNLLVSKEISATELTQATLENIKSREEALNSFVTIAEEQALVQAKAIDEAGIDADNVLS 
GIPLAVKDNISTDGILTTAASKMLYNYEPIFDATAVANAKTKGMIWGKTNMDEFAMGGSGETSHYGAT 
KNAWNHSKVPGGSSSGSAAAVASGQVRLSLGSDTGGSIRQPAAFNGIVGLKPTYGTVSRFGLIAFGSSL 
DQIGPFAPTVKENALLLNAIASEDAKDSTSAPVRIADFTSKIGQDIKGMKIALPKEYLGEGIDPEVKET 
ILNAAKHFEKLGAIVEEVSLPHSKYGVAVYYIIASSEASSNLQRFIXSIRYGYRAEDATNLDEiyVNSRS 
QGFGEEVTCRRIMIXnTSLSSGYYDAYYKKAGQVRTLIIQDFEKVFADYD^ 

PVAMYLADLLTIPVNLAGLPGISIPAGFSQGLPVGLQLIGPKYSEETIYQAAAAFEATTDYHKQQPVIF 
GGDN 

SP030 nucleotide (SEQ ID NO:43) 

CTTTACAGGTAAACAACTACAAGTCGGCGACAAGGCGCTTGATTTTTCTCTTACTACAACAGATCTTTC 
TAAAAAATCTCTGGCTGATTTTGATGGCAAGAAAAAAGTCTTGAGTGTCGTTC 

CATCTGCTCAACTCAAACACGTCGTTTTAATGAAGAATTGGCTGGACTGGACAACACGGTCGTATTGAC 
TGTTTCAATGGACCTACCTTTTGCTCAAAAACGTTGGTGCGGTGCTGAAGGCCTTGACAATGCCATTAT 
GCTTTCAGACTACTTTGACCATTCTTTCGGGCGCGATTATGCCCTCTTGATCAACGAATGGCACCTATT 
AGCACGCGCAGTCTTTGTCCTCGATACTGACAATACGATTCGCTACGTTGAATACGTGGATAATATCAA 
TTCTGAGCCAAACTTCGAA 

SP030 amino acid (SEQ ID NO:44) 

FTGKQLQVGDKALDFSLTTTDLSKKSLADFDGKKKVLSWPSIDTGICSTQTRRFNEEIAGLDNTVVLT 
VSMDLPFAQKRWCGAEGLDNAIMLSDYFDHSFGRDYALLINE^LLARAVFVLDTDNTIRYVEYVDNIN 
SEPNFE 

SP031 nucleotide (SEQ ID NO:45) 

CCAGGCTGATACAAGTATCGCAGACATTCAAAAAAGAGGCGAACTGGTTGTCGGTGTCAAACAAGACGT 
TCCCAATTTTGGTTACAAnGATCCCAAGACCGGTACTTATTCTGGTATCGAAaCCGACTTGGCCAAGAT 
GGTAGCTGATGAACTCAAGGTCAAGATTCGCTATGTGCCGGTTACAGCACAAACCCGCGGCCCCCTTCT 
AGACAATGAACAGGTCGATATGGATATCGCGACCTTTACCATCACGGACGAACGCAAAAAACTCTACAA 
CTTTACCAGTCCCTACTACACAGACGCTTCTGGATTTTTGGTCAATAAATCTGCCAAAATCAAAAAGAT 
TGAGGACCTAAACGGCAAAACCATCGGAGTCGCCCAAGGTTCTATCACCCAACGCCTGATTACTGAACT 
GGGTAAAAAGAAAGGTCTGAAGTTTAAATTCGTCGAACTTGGTTCCTACCCAGAATTGATTACTTCCCT 
GCACGCTCATCGTATCGATACCTTTTCCGTTGACCGCTCTATTCTATCTGGCTACACTAGTAAACGGAC 
AGCACTACTAGATGATAGTTTCAAGCCATCTGACTACGGTATTGTTACCAAGAAATCAAATACAGAGCT
CAACGACTATCTTGATAACTTGGTTACTAAATGGAGCAAGGATGGTAGTTTGCAGAAACTTTATGACCG 
TTACAAGCTCAAACCATCTAGCCATACTGCAGAT

SP031 amino acid (SEQ ID NO:46)

QADTSIADIQKRGELVVGVKQDVPNFGYXDPKTGTYSGIETDLAKMVADELKVKIRYVPVTAQTRGPLL
DNEQVDMDIATFTITDERKKLYNFTSPYYTDASGFLVNKSAKIKKIEDLNGKTIGVAQGSITQRLITEL 
GKKKGLKFKFVELGSYPELITSLHAHRIDTFSVDRSILSGYTSKRTALLDDSFKPSDYGIVTKKSNTEL
NDYLDNLVTKWSKDGSLQKLYDRYKLKPSSHTAD 

SP032 nucleotide (SEQ ID NO:47)

GTCTGT ATC ATTTGAAAACAAAG AAAC AAAC CGTGGTGTCTTgACTTTC AC TATCTCTC AAG ACC AAAT 
C AAACC AGAATTGG AC C GTGTCTTC AAG t C AGTG AAG AAATCTCTT AATGTTC C AGGTTTC C GT AAAGG 
TC AC CTTC C ACGC C CT ATCTTC G ACC AAAAATTTGGTGAAG AAGCTC TTT ATC AAG ATGC AATG AACGC 
ACTTTTGC CAAACGCTT ATG AAG C AGC TGTAAAAG AAGC TGGT C TTG AAGTGGTTGC CC AAC C AAAAAT 
TG ACGT AACTTC AATGG AAAAAGGTC AAG ACTGGGTT ATC ACTGC TGAAGTC GTT AC AAAAC C TG AAGT 
AAAATTGGGTGACTACAAAAACCTTGAAGTATCAGTTGATGTAGAAAAAGAAGTAACTGACGCTGATGT 
CG AAG AGCGTATCGAACGCG AACGC AAC AACCTGGCTGAATTGGTT ATC AAGGAAGCTGCTGCTGAAAA 
CGGCGACACTGTTGTGATCGACTTCGTTGGTTCTATCGACGGTGTTGAATTTGACGGTGGAAAAGGTGA 
AAACTTCTCACTTGGACTTGGTTCAGGTCAATTCATCCCTGGTTTCGAAGACCAATTGGTAGGTCACTC 
AGCTGGCGAAACCGTTGATGTTATCGTAACATTCCCAGAAGACTACCAAGCAGAAGACCTTGCAGGTAA 
AGAAGCTAAATTCGTGACAACTATCCACGAAGTAAAAGCTAAAGAAGTTCCGGCTCTTGACGATGAACT 
TGC AAAAG AC ATTG ATG AAGAAGTTG AAAC ACTTGCTGACTTGAAAG AAAAAT AC AGC AAAGAATTGGC 
TGCTGCTAAAGAAGAAGCTTACAAAGATGCAGTTGAAGGTGCAGCAATTGATACAGCTGTAGAAAATGC 
TGAAATCGTAGAACTTCCAGAAGAAATGATCCATGAAGAAGTTCACCGTTCAGTAAATGAATTCCTTGG 
GAATTTGCAACGTCAAGGGATCAACCCTGACATGTACTTCCAAATCACTGGAACTACTCAAGAAGACCT 
TC AC AAC C AAT AC C AAGC AG AAG C TGAGTC ACGT ACT AAG ACT AACCTTGTTATCGAAGCAGTTGCCAA 
AGCTGAAGGATTTGATGCTTCAGAAGAAGAAATCCAAAAAGAAGTTGAGCAATTGGCAGCAGACTACAA 
CATGGAAGTTGCACAAGTTCAAAACTTGCTTTCAGCTGACATGTTGAAACATGATATCACTATCAAAAA
AGCTGTTGAATTGATCACAAGCACAGCAACAGTAAAA 

SP032 amino acid (SEQ ID NO: 48) 

SVSFENKETNRGVLTFTISQDQIKPELDRVFKSVKKSLNVPGFRKGHLPRPIFDQKFGEEALYQDAMNA 
LLPNAYEAAVTCEAGLEWAQPKIDVTSMEKGQDWVITAEVVTKPEVKI^ 
EERIERERNNIJ^ELVIKEAAAENGDTWIDFVGSIDGVEFDGGKGENFSLGI^ 
AGETVDVIVTFPEDYQAEDLAGKEAKFVTTIHEVKAKEVPALDDEL^ 

AAKEEAYKDAVEGAAIDTAVENAEIVELPEEMIHEEVHRSVNEFLGNLQRQGINPDMYFQITGTTQEDL 
HNQYQAEAESRTKTNLVIEAVAKAEGFDASEEEIQKEVEQLAADYNMEVAQVQNLLSADMLKHDITIKK 
AVELITSTATVK 

SP033 nucleotide (SEQ ID NO:49)

TGGTCAAAAGGAAAGTCAGACAGGAAAGGGGATGAAAATTGTGACCAGTTTTTATCCTATCTACGCTAT 
GGTTAAGGAAGTATCTGGTGACTTGAATGATGTTCGGATGATTCAGTCAAGTAGTGGTATTCACTCCTT 
TG AAC CTTCGGC AAATG AT ATCGC AGCC ATC T ATG ATGC AG ATGTCTTTGTTT AC C ATTCTC AT AC AC T 
C GAATC TTGGGC AGG AAGTC TGG ATC C AAATCT AAAAAAATC C AAAGTG AAGGTC TT AGAGGCTTC TG A 
GGGAATGACCTTGGAACGTGTCCCTGGACTAGAGGATGTGGAAGCAGGGGATGGAGTTGATGAAAAAAC 
GCTCTATGACCCTCACACATGGCTAGATCCTGAAAAAGCTGGAGAAGAAGCCCAAATTATCGCTGATAA 
ACTTTCAGAGGTGGATAGTGAGCATAAAGAGACTTATCAAAAAAATGCGCAACCTTTATCAAAAAAGCT 
CAGGAAT 

SP033 amino acid (SEQ ID NO:50) 

GQKESQTGKGMKIVTSFYPIYAMVKEVSGDLJNroVRMIQSSSGIHSFEPSAN^ 

ESWAGSLDPNLKXSKyKVLEASEGMTLERVPGLEDVEAGDGVDEKTLYDPHTW^ 

LSEVDSEHKETYQKNAQPLSKKLRN 

SP034 nucleotide (SEQ ID NO: 51) 

GAAGG AT AG AT AT ATTTT AGC ATTTG AG AC ATCCTGTG ATG AG ACC AGTGT CG CCGTCTTG AAAAAC G A 
CGATGAGCTCTTGTCCAATGTCATTGCTAGTCAAATTGAGAGTCACAAACGTTTTGGTGGCGTAGTGCC 
C G AAGT AGC C AGTCGTC AC C ATGTC G AGGTC ATT AC AGC CTGT ATCG AGG AGGC ATTGGC AG AAGC AGG 
GATTACCGAAGAGGACGTGACAGCTGTTGCGGTTACCTACGGACCAGGCTTGGTCGGAGCCTTGCTAGT 
TGGTTTGTCAGCTGCCAAGGCCTTTGCTTGGGCTCACGGACTTCCACTGATTCCTGTTAATCACATGGC 
TGGGCACCTCATGGCAGCTCAGAGTGTGGAGCCTTTGGAGTTTCCCTTGCTAGCCCTCTTGGTCAGCGG 
CGGACACACAGAGTTGGTTTATGTTTCGGAGGCAGGAGATTATAAGATTGTTGGGGAAACCCGTGATGA 
TGCGGTTGGTGAGGCTTATGATAAGGTCGGCCGTGTCATGGGCTTGACCTATCCTGCAGGTCGTGAGAT 
TGACGAGCTGGCTCATCAGGGGCAGGATATTTATGATTTCCCCCGTGCCATGATTAAGGAAGATAATCT 
GGAGTTCTCCTTCTCAGGTTTGAAATCTGCCTTTATCAATCTTCATCACAATGCCGAGCAAAAGGGAGA 
AAGCCTGTCT AC AGAAGATTTGTGTGCTTCCTTCCAAGC AGC AGTTATGG AC ATTCTC ATGGC AAAAAC 
CAAGAAGGCTTTGGAGAAATATCCTGTTAAAATCCTAGTTGTGGCAGGTGGTGTGGCAGCCAATAAAGG 
TCTCAGAGAACGCCTAGCAGCCGAAATCACAGATGTCAAGGTTATCATCCCCCCTCTGCGACTCTGCGG 
AG AC AATGC AGGT ATG ATTGCC T ATG C C AGC GTC AGC N AGTGG AAC AAAGAAAACTTCGC AGGCTGGG A 
CCTCAATGCC AAAC CAAGTCTTGCCTTTG AT ACC ATGG AA 

SP034 amino acid (SEQ ID NO:52) 

KDRYILAFETSCDETSVAVLKNDDELLSNVIASQIESHKRFGGVVPEVASRHHVEVITACIEEALAEAG
ITEEDVTAVAVTYGPGLVGALLVGLSAAKAFAWAHGLPLIPVNHMAGHLMAAQSVEPLEFPLLALLVSG 
GHTELVYVSEAGDYKIVGETRDDAVGEAYDKVGRVMGLTYPAGREIDELAHQGQDIYDFPRAMIKEDNL 
EFSFSGLKSAFINLHHNAEQKGESLSTEDLCASFQAAVMDILMAKTKKALEKYPVKILVVAGGVAANKG
LRERLAAEITDVKVIIPPLRLCGDNAGMIAYASVSXWNKENFAGWDLNAKPSLAFDTME

SP035 nucleotide (SEQ ID NO:53) 

GGTAGTTAAAGTTGGTATTAACGGTTTCGGACGTATCGGTCGTCTTGCTTTCCGTCGTATCCAAAACGT 
AGAAGGTGTTGAAGTTACACGCATCAACGACCTTACAGATCCAGTTATGCTTGCACACTTGTTGAAATA 
CGACACAACTCAAGGTCGTTTCGACGGTACTGTTGAAGTTAAAGAAGGTGGATTTGAAGTTAACGGTAA 
ATTCATCAAAGTTTCTGCTGAACGTGATCCAGAACAAATCGACTGGGCTACTGACGGTGTAGAAATCGT 
TCTTGAAGCTACTGGTTTCTTTGCTAAGAAAGAAGCAGCTGAAAAACACCTTAAAGGTGGAGCTAAAAA 
AGTTGTTATCACTGCTCCTGGTGGAAACGACGTTAAAACAGTTGTATTCAACACTAACCACGACGTTCT
TGACGGTACTGAAACAGTTATCTCAGGTGCTTCATGTACTACAAACTGCTTGGCTCCAATGGCTAAAGC 
TCTTCAAGACAACTTTGGTGTTGTTGAAGGATTGATGACTACTATCCACGCTTACACTGGTGACCAAAT 
GATCCTTGACGGACCACACCGTGGTGGTGACCTTCGCCGTGCTCGCGCTGGTGCTGCAAACATCGTTCC 
TAACTCAACTGGTGCTGCAAAAGCTATCGGTCTTGTAATCCCAGAATTGAATGGTAAACTTGACGGATC 
TGCACAACGCGTTCCAACTCCAACTGGATCAGTTACTGAATTGGTAGCAGT^CTTCAAAAGAACGTTAC 
TGTTGATGAAGTGAACGCAGCTATGAAAGCAGCTTCAAACGAATCATACGGTTACACAGAAGATCCAAT 
CGTATCTTCAGATATCGTAGGTATGTCTTACGGTTCATTGTTTGACGCAACTCAAACTAAAGTTCTTGA 
CGTTGACGGTAAACAATTGGTTAAAGTTGTATCATGGTACGACAACGAAATGTCATACACTGCACAACT 
TGTTCGTACTCTTGGAATACTTCGCAAAAATTGC 

SP035 amino acid (SEQ ID NO:54) 

VVKVGINGFGRIGRLAFRRIQ^EGVEVTRINDLTDPVMIAHLLKYDTTQGRFDG 

FIKVSAERDPEQIDWATDGVEIVLEATGFFAKKEAAEKHLKGGAKKWITAPGGN^ 

DGTETVISGASCTTNClAPMAKALQDNFGVVEGIJlTTIHAyTGDQMILDGPHRG^ 

NSTGAAKAIGLVIPELNGKLDGSAQRVPTPTGSVTELVAVLEKNVTVDEWAAMKAASNESYGYTEDPI 

VSSDIVGMSYGSLFDATQTKVLDVTOKQLVKWSWYDNEMSYTAQLVRTLGILRKNC 

SP036 nucleotide (SEQ ID NO:55)

TTCTT ACG AGTTGGG ACTGT ATC AAGC T AG AAC GGTT AAGG AAAAT AATCGTGTTTC C T AT AT AG ATGG 
AAAACAAGCGACGCAAAAAACGGAGAATTTGACTCCTGATGAGGTTAGCAAGCGTGAAGGAATCAATGC 
TG AGC AAATCGTC ATC AAGAT AAC AG AC C AAGGC T ATGTC ACTTC AC ATGGCG ACC AC T ATC ATT ATT A 
C AATGGT AAGGTTC CTT ATG ACGC T ATC AT C AGTG AAGAATT ACTC ATG AAAG ATC C AAACT AT AAGCT 
AAAAGATGAGGATATTGTTAATGAGGTCAAGGGTGGATATGTTATCAAGGTAGATGGAAAATACTATGT 
TT AC CTT AAGG ATGCJGC C C ACGCGG AT AACGTC CGTAC AAAAG AGG AAATC AATCGAC AAAAAC AAG A 
GCATAGTCAACATCGTGAAGGTGGAACTCCAAGAAACGATGGTGCTGTTGCCTTGGCACGTTCGCAAGG 
ACGCTATACTACAGATGATGGTTATATCTTTAATGCTTCTGATATCATAGAGGATACTGGTGATGCTTA 
TATCGTTCCTCATGGAGATCATTACCATTACATTCCTAAGAATGAGTTATCAGCTAGCGAGTTGGCTGC 
TGCAGAAGCCTTCCTATCTGGTCGAGGAAATCTGTCAAATTCAAGAACCTATCGCCGACAAAATAGCGA 
T AAC ACTTC AAG AAC AAACTGGGT AC C TT CTGTAAGC AATCC AGG AACT AC AAAT AC T AAC AC AAGC AA 
CAACAGCAACACTAACAGTCAAGCAAGTCAAAGTAATGACATTGATAGTCTCTTGAAACAGCTCTACAA 
ACTGCCTTTGAGTCAACGACATGTAGAATCTGATGGCCTTGTCTTTGATCCAGCACAAATCACAAGTCG 
AACAGCTAGAGGTGTTGCAGTGCCACACGGAGATCATTACCACTTCATCCCTTACTCTCAAATGTCTGA 
ATTGGAAGAACGAATCGCTCGTATTATTCCCCTTCGTTATCGTTCAAACCATTGGGTACCAGATTCAAG 
GCCAGAACAACCAAGTCCACAACCGACTCCGGAACCTAGTCCAGGCCCGCAACCTGCACCAAATCTTAA 
AATAGACTCAAATTCTTCTTTGGTTAGTCAGCTGGTACGAAAAGTTGGGGAAGGATATGTATTCGAAGA 
AAAGGGCATCTCTCGTTATGTCTTTGCGAAAGATTTACCATCTGAAACTGTTAAAAATCTTGAAAGCAA 
GTTATCAAAACAAGAGAGTGTTTCACACACTTTAACTGCTAAAAAAGAAAATGTTGCTCCTCGTGACCA 
AG AATTTT ATG AT AAAGC AT AT AATCTG TT AACTG AGGCTC AT AAAG C C TTGTTTGN AAAT AAGGGTCG 
T AATTCTG ATTTCCAAGC C TT AG AC AAATT ATT AGAACGC TTG AATG ATG AATCG ACT AAT AAAG AAAA 
ATTGGTAGATGATTTATTGGCATTCCTAGCACCAATTACCCATCCAGAGCGACTTGGCAAACCAAATTC 
TCAAATTGAGTATACTGAAGACGAAGTTCGTATTGCTCAATTAGCTGATAAGTATACAACGTCAGATGG 
TTACATTTTTGATGAACATGATATAATCAGTGATGAAGGAGATGCATATGTAACGCCTCATATGGGCCA 
TAGTCACTGGATTGGAAAAGATAGCCTTTCTGATAAGGAAAAAGTTGCAGCTCAAGCCTATACTAAAGA 
AAAAGGTATCCTACCTCCATCTCCAGACGCAGATGTT AAAGC AAATCCAACTGGAGATAGTGCAGCAGC 
TATTTACAATCGTGTGAAAGGGGAAAAACGAATTCCACTCGTTCGACTTCCATATATGGTTGAGCATAC 
AGTTGAGGT^AAAAACGGTAATTTGATTATTCCTCATAAGGATCATTACCATAATATTAAATTTGCTTG 
GTTTGATGATCACACATACAAAGCTCCAAATGGCTATACCTTGGAAGATTTGTTTGCGACGATTAAGTA 
CTACGTAGAACACCCTGACGAACGTCCACATTCTAATGATGGATGGGGCAATGCCAGTGAGCATGTGTT 
AGGC AAG AAAG ACC AC AGTG AAG ATC C AAAT AAG AACTTCAAAGCGG ATG AAG AGCCAGTAG AGG AAAC 
ACCTGCTGAGCCAGAAGTCCCTCAAGTAGAGACTGAAAAAGTAGAAGCCCAACTCAAAGAAGCAGAAGT 
TTTGCTTGCGAAAGTAACGGATTCTAGTCTGAAAGCCAATGCAACAGAAACTCTAGCTGGTTTACGAAA 
TAATTTGACTCTTCAAATTATGGATAACAATAGTATCATGGCAGAAGCAGAAAAATTACTTGCGTTGTT 
AAAAGGAAGTAATCCTTCATCTGTAAGTAAGGAAAAAATAAAC 

SP036 amino acid (SEQ ID NO:56)

SYELGLYQART\r<ENNRVSYIDGKQATQKTENLTPDEVSKREGINAEQIVIKITDQGYVTSHGDHYHYY 
NGKVPYDAIISEELLMKDPNYKLKDEDIWEVKGGWIKVDGKYYVYLKDAAHADNTO 
HSQHREGGTPRNDGAVALARSQGRYTTDDGYIFNASDIIEDTGDAYIVPHGDHYHYIPKNELSASELAA

AEAFLSGRGNLSNSRTYRRQNSDNTSRTNWPSVSNPGTTNTNTSNNSNTNSQASQSNDIDSLLKQLYK 

LPLSQRHVESDGLVFDPAQITSRTARGVAVPHGDHYHFIPYSQMSELEERIARIIPLRYRSNHWPDSR 

PEQPSPQPTPEPSPGPQPAPNLKIDSNSSLVSQI.VRKVGEGYWEEKGISRYWAKDLPSETVTOILESK 

LSKQESVSHTLTAKKENVAPRDQEFYDKAYNLLTEAHKALFXNKGRNSDFQALDKLLERLNDESTNKEK 

LVDDLLAFIAPITHPERI^KPNSQIEYTEDEVRIAQLADKYTTSDGYIFDEHDIISDEGDAYVTPHMGH 

SHWIGKDSLSDKEKVAAQAYTKEKGILPPSPDADVKANPTGDSAAAIYNRVKGEKRIPLVRLPYMVEKT 

VEVKNGNLIIPHKDHYHNIKFAWFDDHTYKAPNGYTLEDLFATIKYYVEHPDERPHSNDGWGNASEHVL

GKKDHSEDPNKNFKADEEPVEETPAEPEVPQVETEKVEAQLKEAEVLI^ 

NLTLQIMDNNSIMAEAEKLLALLKGSNPSSVSKEKIN

SP038 nucleotide (SEQ ID NO: 57) 

TACTGAGATGCATCATAATCTAGGAGCTGAAAAGCGTTCAGCAGTGGCTACTACTATCGATAGTTTTAA 
GGAGCGAAGTCAAAAAGTCAGAGCACTATCTGATCCAAATGTGCGTTTTGTTCCCTTCTTTGGCTCTAG 
TGAATGGCTTCGTTTTGACGGTGCTCATTCTGCGGTATTAGCTGAGAAATACAATCGTTCCTACCGTCC 
TT AT C TTTT AGG AC AGGGGGG AGCTGC ATCGC TT AACC AAT ATTTTGGAATGC AAC AGATGTT AC C AC A 
GCTGGAGAATAAACAAGTTGTGTATGTTATCTCACCTCAGTGGTTCAGTAAAAATGGCTATGATCCAGC 
AGCCTTCCAGCAGTATTTTAATGGAGACCAGTTGACTAGTTTTCTGAAACATCAATCTGGGGATCAGGC 
TAGTCAATATGCAGCGACTCGCTTACTGCAACAGTTCCCAAACGTAGCTATGAAGGACCTGGTTCAGAA 
GTTGGCAAGTAAAGAAGAATTGTCGACAGCAGACAATGAAATGATTGAATTATTGGCTCGTTTTAATGA 
ACGCCAAGCTTCCTTTTTTGGTCAGTTTTCGGTTAGAGGCTATGTTAACTACGATAAGCATGTAGCTAA 
GTATTTAAAAATCTTGCCAGACCAGTTTTCTTATCAGGCAATAGAAGATGTTGTCAAAGCAGATGCTGA 
AAAAAAT AC TT C C AAT AA TGAG ATGGG AATGG AAAATT ATTTCT AT AATG AGC AG ATC AAG AAGG ATTT 
GAAGAAATTAAAGGATTCTCAGAAAAGCTTTACCTATCTCAAGTCGCCAGAGTATAATGNNTTGCAGTT 
GGTTTT AAC AC AGTTTTC T AAATC T AAGGT AAAC C CG ATTTTT ATC ATTC C AC CTGTT AAT AAAAAATG 
G ATG MAC T ATGC TGGTC T ACG AG AGG AT ATGT AC C AAC AAACGGTG C AG AAG ATTC GCT AC C AG TT AG A 
AAGTCAAGGTTTTACCAATATAGCAGATTTTTCTAAGGACGGCGGGGAGCCTTTCTTTATGAAGGACAC 
CATTCACCTTGGTTGGTTGGGTTGGTTGGCTTTTGACAAGGCAGTTGATCCTTTCCTATCCAATCCCAC 
AC C AG C TC CGACTTACCATCTG AATG AGCGCTTTTTC AGC AAAG ATTGGGCG AC TT ATGATGG AG ATGT 
CAAAGAA 

SP038 amino acid (SEQ ID NO: 58) 

TEMHHNLGAEKRSAVATTIDSFKERSQKVRALSDPNVRFVPFFGSSEW 

YLLGQGGAASLNQYFGMQQMLPQLENKQVVYVISPQWFSKNGYDPAAFQQYFNGDQLTSFLKHQSGDQA 
SQYAATRLLQQFPNVAMKDLVQKLASKEELSTAD^^ 

YLKILPDQFSYQAIEDVVKADAEKNTS^EMGMEOTFYNEQIKKDLKKLKDSQKS 
VLTQFSKSKWPIF IIPPWKKWMXYAGLREDMYQQWQKIRYQLESQ^ 
IKLGWLGWLAFDKAVDPFLSNPTPAPTYHLNERFFSKDWATYDGDVKE 

SP039 nucleotide (SEQ ID NO:59)

GG TTTTGAG AAAG T ATTTGC AGGGGGCC C TG ATTG AGTCG ATTG AGC AAGTGG AAAATG AC C GT ATTGT 
GG AAATT AC AGTTTC C AAT AAAAAC G AG ATTGG AG AC C AT ATC C AGGCT AC C TTG ATTATC G AAATT AT 
GGGG AAAC AC AGT AAT ATTCT AC TGGTC GAT AAAAGCAGTC AT AAAATC C T C G AAGTT ATC AAAC AC G T 
CGGCTTTTCACAAAATAGCTACCGCACCTTACTTCCAGGATCGACCTATATCGCTCCGCCAAGTACAAA 
ATCTCTCAATCCTTTTACTATCAAGGATGAAAAGCTCTTTGAAATCCTGCAAACCCAAGAACTAACAGC 
AAAAAATCTTCAAAGCCTCTTTCAAGGTCTGGGACGCGATACGGCAAATGAATTGGAAAGGATACTGGT 
TAGTGAAAAACTTTCCGCTTTCCGAAATTTTTTCAATC AAG AAAC CAAGCC ATGC TTG ACTG AG ACTTC 
CTTC AG TCCAGTTC CTTTTGC AAATC AGGTGGG AG AGC CTTTTGC AAAT CTTTCTG ATTTGTTGG AC AC 
C TACT AT AAGG AT AAGGCTG AGC GCG AC CGCGTC AAAC AGC AGGCCAGTGAACTG ATTC GTCGTGTTGA 
AAATGAACTTCAGAAAAACCGACACAAACTCAAAAAACAGGAAAAAGAGTTACTGGCGACAGACAACGC 
TGAAGAATTTCGTCAAAAAGGAGAATTGCTGACAACCTTCCTCCACCAAGTGCCTAACGACCAAGACCA 
GGTTATCCTAGACAACTACTATACCAACCAACCTATCATGATTGCGCTTGATAAGGCTCTGACTCCCAA 
CCAGAATGC CC AAC GCTATTTT AAAC GGTATC AG AAACTCAAAGAAGCTGTCAAATACTTGACTG ATTT 
GATTGAAGAAACCAAAGCCACTATTCTCTATCTGGAAAGTGTAGAAACCGTCCTCAACCAAGCTGGACT 
GGAAGAAATCGCTGAAATCCGTGAAGAATTGATTCAAACAGGTTTTATCCGCAGAAGACAACGGGAGAA 
AATCCAGAAACGCAAAAAACTAGAACAATATCTAGCAAGCGATGGCAAAACCATCATCTATGTCGGACG 
AAAC AATCTTC AAAATG AGG AP.TTGACCTTTAAAATGGCCCGC AAGG AGG AAC TTTGGTTCCATGCTAA 
GG AC ATTCC TGG AAGC C ATGTTGT CATC TC AGG AAATC TTG AC C C ATCTG ATGC AGTC AAG AC AG ACGC 
AGCAGAGTTAGCTGCCTACTTCTCTCAAGGGCGCCTGTCGAATCTGGTGCAGGTAGATATGATTGAAGT
CAAAAAACTCAATAAACCAACTGGTGGAAAACCCGGCTTTGTCACTTACACAGGACAAAAGACCCTCCG 
CGTCACACCAGACTCCAAAAAAATTGCATCCATGAAAAAATCC 

SP039 amino acid (SEQ ID NO:60)

\^RKYLQGALIESIEQVENDRIVEITOSNKNEIGDHIQATLII£IMGKHSNILLVPKSSHKILEVIKHV 

GFSQNSYRTLLPGSTYIAPPSTKSLNPFTIKDEKLFEILQTQELTAKNLQSLFQGLGRDTANELERILV 

SEKLSAFRNFTOQETKPCLTETSFSPVPFANQVGEPFANLSDLLDTYYKDKAERDRVKQQASELIRRVE 

NEI^KNRHKLKKQEKELLATDNAEEFRQKGELLTTFUiQVPNDQDQVILDNYYTNQPIMIAL 

QNAQRYFKRYQKLKEAVKYLTDLIEETKATILYLESVETVLNQAGLEEIAEIREELIQTGFIRRRQREK 

IQKRKKLEQYLASDGKTIIWGRNNLQ^ELTFKMARKEELWFHAKDIPGSHWISGNLDPSDAVKTDA 

AELAAYFS(^RLSNLVQVDMIETOKLNKPTGGKPGFWYTGQKTLRVTPDSKKIASMKKS 

SP040 nucleotide (SEQ ID NO:61)

GACAACATTTACTATCCATACAGTAGAGTCAGCACCAGCAGAAGTGAAAGAAATTCTTGAAACAGTAGA 
AAAAGACAACAATGGCTATATTCCCAACCTAATCGGTCTCTTGGCCAATGCCCCGACTGTTTTAGAAGC 
CTACCAAATTGTCTCATCTATCCACCGTCGCAACAGCCTGACACCCGTTGAGCGTGAAGTGGTGCAAAT 
CACGGCAGCCGTGACCAATGGTTGTGCCTTCTGTGTCGCAGGTCACACAGCCTTTTCCATCAAACAAAT 
CCAGATGAATGATGACTTGATTCAAGCTCTTCGCAATCGTACTCCAATTGAAACAGATCCTAAATTGGA 
TACCCTAGCTAAGTTTACCTTGGCAGTTATCAATACCAAGGGTCGTGTAGGAGATGAAGCCTTGTCTGA 
GTTTTTAGAAGCTGGCTACACTCAACAAAATGCCTTGGATGTGGTTTTTGGTGTCAGCCTAGCAATCCT 
CTGTAACTATGCCAACAACTTAGCTAATACACCAATTAATCCAGAATTGCAACCTTATGCC 

SP040 amino acid (SEQ ID NO:62) 

TTFTIHTVESAPAEVKEILETVEKDNNGYIPNLIGLLANAPTVLEAYQIVSSIHRRNSLTPVEREVVQI
TAAVTNGCAFCVAGHTAFSIKQIQMNDDLIQALRNRTPIETDPKLDTLAKFTLAVINTKGRVGDEALSE
FLEAGYTQQNALDVVFGVSLAILCNYANNLANTPINPELQPYA

SP041 nucleotide (SEQ ID NO:63) 

GGCTAAGGAAAGAGTGGATGTACTAGCTTATAAACAGGGGTTGTTTGAAACGAGAGAGCAGGCCAAGCG 
AGGTGTG ATGGCTGGC C T AGTCGT AGC AGTC CTT AATGG AGAACGGTTTG AC AAGC C AGG AGAG AAAAT 
TCCAGATGACACCGAATTAAAACTCAAGGGGGAGAAACTCAAGTATGTCAGCCGTGGTGGTTTGAAACT 
GGAAAAGGCCTTGCAGGTCTTTGATTTGTCGGTGGATGGCGCGACTACGATTGATATCGGGGCCTCTAC 
TGGAGGTTTTACCGATGTCATGCTACAGAATAGTGCCAAGTTGGTCTTTGCAGTCGATGTTGGTACCAA 
TC AGTTGGC TTGG AAATT ACGC C AAG AC C C AC G AGTTGTC AGC ATGG AGC AGTTC AATTTCCGC T ATGC 
TG AAAAG AC TG ATTTCG AGC AGG AGC CG AG CTTTGC C AGT ATTG ATG TG AG TTTC ATTTCCC TT AG TCT 
GATTTTGCCAGCCTTGCACCGTGTCTTGGCTGATCAAGGTCAGGTGGTAGCACTTGTCAAACCTCAGTT 
TG AGGC AGG AC G TG AGC AG ATTGGG AAAAATGG AATT ATTCG AG ATGCT AAGGTTC ATC AG AATG T C CT 
TG AATCTGT AAC AGCT ATGGC AG TAG AGGTAGGTTTTTC AGTC C TTGGCTTGG AC TTTTC TC C C AT C C A 
AGGTGGACATGGAAATATTGAATTTTTAGCGTATTTGAAAAAAGAAAAGTCAGCAAGCAATCAGATTCT 
TGCTGAGATTAAAGAAGCAGTAGAGAGGGCGCATAGTCAATTTAAAAATGAA 

SP041 amino acid (SEQ ID NO: 64) 

AKERVDVLAYKQGLFETREQAKRGVMAGLVVAVLNGERFDKPGEKIPDDTELKLKGEKLKYVSRGGLKL
EKALQVFDLSVDGATTIDIGASTGGFTDVMLQNSAKLVFAVDVGTNQLAWKLRQDPRVVSMEQFNFRYA
EKTDFEQEPSFASIDVSFISLSLILPALHRVLADQGQVVALVKPQFEAGREQIGKNGIIRDAKVHQNVL
ESVTAMAVEVGFSVLGLDFSPIQGGHGNIEFLAYLKKEKSASNQILAEIKEAVERAHSQFKNE

SP042 nucleotide (SEQ ID NO: 65) 

TTGTTCCTATGAACTTGGTCGTCACCAAGCTGGTCAGGTTAAGAAAGAGTCTAATCGAGTTTCTTATAT 

AGATGGTGATCAGGCTGGTCAAAAGGCAGAAAACTTGACACCAGATGAAGTCAGTAAGAGGGAGGGGAT 

C AACGCC G AAC AAATNGTN ATC AAG ATT AC GG ATC AAGGTT ATGTG AC C TC T C ATGG AG ACC ATT ATC A 

TTACTATAATGGCAAGGTTCCTTATGATGCCATCATCAGTGAAGAGCTCCTCATGAAAGATCCGAATTA 

TCAGTTGAAGGATTCAGACATTGTCAATGAAATCAAGGGTGGTTATGTCATTAAGGTAAACGG^ 

CT ATGTNT AC CTT AAGG ATGC AGC TC ATGCGG AT AATATTCGGAC AAAAG AAG AGATTAAACGTCAGAA 

GCAGGAACGCAGTCATAATCATAACTCAAGAGCAGATAATGCTGTTGCTGCAGCCAGAGCCCAAGGACG 

TTATACAACGGATGATGGGTATATCTTCAATGCATCTGATATCATTGAGGACACGGGTGATGCTTATAT 

CGTTCCTCACGGCGACCATTACCATTACATTCCTAAGAATGAGTTATCAGCTAGCGAGTTAGCTGCTGC 
AGAAGCCTATTGGAATGGGAAGCAGGGATCTCGTCCTTCTTCAAGTTCTAGTTATAATGCAAATCCAGC
TCAACCAAGATTGTCAGAGAACCACAATCTGACTGTCACTCCAACTTATCATCAAAATCAAGGGGAAAA 
C ATTTC AAGC CTTTT ACGTG AATTGT ATGCT AAAC C CTT ATC AG AACGCC ATGTGG AATC TG ATGGC CT 
TATTTTCGACCCAGCGCAAATCACAAGTCGAACCGCCAGAGGTGTAGCTGTCCCTCATGGTAACCATTA 
CCACTTTATCCCTTATGAACAAATGTCTGAATTGGAAAAACGAATTGCTCGTATTATTCCCCTTCGTTA 
TCGTTCAAACCATTGGGTACCAGATTCAAGACCAGAACAACCAAGTCCACAATCGACTCCGGAACCTAG 
TCCAAGTCCGCAACCTGCACCAAATCCTCAACCAGCTCCAAGCAATCCAATTGATGAGAAATTGGTCAA 
AGAAGCTGTTCGAAAAGTAGGCGATGGTTATGTCTTTGAGGAGAATGGAGTTTCTCGTTATATCCCAGC 
CAAGGATCTTTCAGCAGAAACAGCAGCAGGCATTGATAGCAAACTGGCCAAGCAGGAAAG7TTATCTCA 
TAAGCTAGGAGCTAAGAAAACTGACCTCCCATCTAGTGATCGAGAATTTTACAATAAGGCTTATGACTT 
ACTAGCAAGAATTCACCAAGATTTACTTGATAATAAAGGTCGACAAGTTGATTTTGAGGCT7TGGATAA 
CCTGTTGGAACGACTCAAGGATGTCNCAAGTGATAAAGTCAAGTTAGTGGANGATATTCTTGCCTTCTT 
AGC TC CG ATTC GTC ATCC AG AAC GTTT AGG AAAAC C AAATGCGC AAATT AC CT AC ACTG ATGATG AG AT 
TCAAGTAGCCAAGTTGGCAGGCAAGTACACAACAGAAGACGGTTATATCTTTGATCCTCGTGATATAAC 
CAGTGATGAGGGGGATGCCTATGTAACTCCACATATGACCCATAGCCACTGGATTAAAAAAGATAGTTT 
GTCTG AAGCTG AG AGAGCGGC AGC C C AGGCTT ATGCT AAAG AG AAAGGTTTG AC CC CTC C TTCG AC AG A 
CCATCAGGATTCAGGAAATACTGAGGCAAAAGGAGCAGAAGCTATCTACAACCGCGTGAAAGCAGCTAA 
GAAGGTGCCACTTGATCGTATGCCTTACAATCTTCAATATACTGTAGAAGTCAAAAACGGTAGTTTAAT 
CATACCTCATTATGACCATTACCATAACATCAAATTTGAGTGGTTTGACGAAGGCCTTTATGAGGCACC 
TAAGGGGTATACTCTTGAGGATCTTTTGGCGACTGTCAAGTACTATGTCGAACATCCAAACGAACGTCC 
GC ATTC AG AT AATGGTTTTGGTAACGCTAGCGACCATGTTC AAAG AAAC AAAAATGGTC AAGCTG AT AC 
CAATCAAACGGAAAAACCAAGCGAGGAGAAACCTCAGACAGAAAAACCTGAGGAAGAAACCCCTCGAGA 
AG AG AAACC GC AAAGCG AG AAAC C AG AG TCTCCAAAACC AAC AG AGG AAC CAGAAGAATCA.CC AG AGG A 
ATC AG AAG AAC C TC AGGTCG AG ACTG AAAAGGTTG AAG AAAAAC TG AG AG AGGCTG AAG ATTT AC TTGG 
AAAAATCCAGGAT

SP042 amino acid (SEQ ID NO: 66) 

CSYELGRKQAGQVXKESNRVSYIDGDQAGQKAENLTPDEVSKREGINAEQXVIKITDQGr/TSHGDHYH 

YYNGKVPYDAI I SEELLMKDPNYQLKDSDI VNEIKGGYVI KVNGKYYVYLKDAAHADNIRTKEEI KRQK 

QERSHNHNSRADNAVAAARAQGRYTTDDGYIFNASDIIEDTGDAYIVPHGDHYHYIPKNEL3ASELAAA 

EAYWNGKQGSRPSSSSSYNANPAQPRLSENHNLTVTPTYHQNQGENISSLLRELYAKPLSERHVESDGL 

IFDPAQITSRTARGVAVPHGNHYHFIPYEQMSELEKRIARIIPLRYRSNHWVPDSRPEQPSPQSTPEPS 

PSPQPAPNPQPAPSNPIDEKLVKEAVRKVGDGYVFEENGVSRYIPAKDLSAETAAGIDSKLAKQESLSH 

KLGAKKTDLPSSDREFYNKAYDLLARIHQDLLDNKGRQVDFEALDNLLERLKDVXSDfCVKLVXDILAFL 

APIRHPERLGKPNAQITYTDDEIQVAKIAGKYTTEDGYIrDPRDITSDEGDAYVTPHMTKSHWIJCKDSL 

SEAERAAAQAYAKEKGLTPPSTDHQDSGNTEAKGAEAIYNRVKAAKKVPLDRMPYNLQYT/SNn^ 

IPHYDHYHNIKFEWFDEGLYEAPKGYTLEDLIATVKYYVEHPNERPKSDNGFGNASDHVQRNKNGQADT 

NQTEKPSEEKPQTEKPEEETPREEKPQSEKPESPKPTEEPEESPEESEEPQVETEKVEEKLREAEDLLG 

KIQD 

SP043 nucleotide (SEQ ID NO:67)

TT AT AAGGGTG AATT AG AAAAAGG AT AC C AATTTG ATGGTTGGG AAATTTC TGG7TT CG AAGGT AAAAA 
AG ACGCTGGC TATGTT ATT AATC T ATC AAAAGATAC CTTT AT AAAAC C TG T ATTC AAGAAAAT AG AGGA 
G AAAAAGG AGG AAGAAAATAAACCT AC TTTTGATGTATCG AAAAAG AAAG AT AAC C C AC AAGT AAAC C A 
TAGTC AATT AAATGAAAGTC AC AGAAAAG AGG ATTT AC AAAG AG AAG AGC ATTC AC AAAAATCTG ATTC 
AACTAAGGATGTTACAGCTACAGTTCTTGATAAAAACAATATCAGTAGTAAATCAACTACTAACAATCC 
TAATAAG 

SP043 amino acid (SEQ ID NO:68)

YKGELEKGYQFDGWEISGFEGKKDAGWINLSKDTFIKPVFKKIEEKKEEENKPTFDVSKKKDNPQVNH 
SQLNESHRKEDLQREEHSQKSDSTKDVTATVLDKNNISSKSTTNNPNK 

SP044 nucleotide (SEQ ID NO:69) 

GAATGTTCAGGCTCAAGAAAGTTCAGGAAATAAAATCCACTTTATCAATGTTCAAGAAGGTGGCAGTGA 
TGCGATTATTCTTGAAAGCAATGGACATTTTGCCATGGTGGATACAGGAGAAGATTATGATTTCCCAGA 
TGG AAGTG ATTC TC GCT ATC C ATGGAG AG AAGG AATTG AAACGTC TT AT AAGC ATGTTCT AAC AG AC C G 
TGTC TTTCGTC G TTTG AAGGAATTGGGTG TCC AAAAACTTGATTTT ATTTTGGTGAC CC AT ACCC AC AG 
TG ATCAT ATTGG AAATGTTG ATG AATT AC TGTCT AC C T ATC C AGTTG AC C G AGTCT ATC TT AAGAAAT A 
TAGTGATAGTCGTATTACTAATTCTGAACGTCTATGGGATAATCTGTATGGCTATGATAAGGTTTTACA
GACTGCTGCAGAAAAAGGTGTTTCAGTTATTCAAAATATCACACAAGGGGATGCTCATTTTCAGTTTGG 
GGACATGGATATTCAGCTCTATAATTATGAAAATGAAACTGATTCATCGGGTGAATTAAAGAAAATTTG 
GGATGACAATTCCAATTCCTTGATTAGCGTGGTGAAAGTCAATGGCAAGAAAATTTACCTTGGGGGCGA 
TTTAGATAATGTTCATGGAGCAGAAGACAAGTATGGTCCTCTCATTGGAAAAGTTGATTTGATGAAGTT 
TAATCATCACCATGATACCAACAAATCAAATACCAAGGATTTCATTAAAAATTTGAGTCCGAGTTTGAT 
TGTTCAAACTTCGGATAGTCTACCTTGGAAAAATGGTGTTGATAGTGAGTATGTTAATTGGCTCAAAGA 
ACGAGGAATTGAGAGAATCAACGCAGCCAGCAAAGACTATGATGCAACAGTTTTTGATATTCGAAAAGA 
CGGTTTTGTCAATATTTCAACATCCTACAAGCCGATTCCAAGTTTTCAAGCTGGTTGGCATAAGAGTGC 
ATATGGGAACTGGTGGTATCAAGCGCCTGATTCTACAGGAGAGTATGCTGTCGGTTGGAATGAAATCGA 
AGGTGAATGGTATTACTTTAACCAAACGGGTATCTTGTTACAGAATCAATGGAAAAAATGGAACAATCA 
TTGGTTCTATTTGACAGACTCTGGTGCTTCTGCTAAAAATTGGAAGAAAATCGCTGGAATCTGGTATTA 
TTTTAACAAAGAAAACCAGATGGAAATTGGTTGGATTCAAGATAAAGAGCAGTGGTATTATTTGGATGT 
TG ATGGTTCT ATG AAG AC AGG ATGGC TTC AAT AT ATGGGGC AATGGT ATT ACTTTGCTCC ATC AGGGG A 
A 

SP044 amino acid (SEQ ID NO: 70) 

NVQAQESSGNKIKFINVQEGGSDAIILESNGHFAMVDTGEDYDFPDGSDSRYPWREGIETSYKHVLTDR 
VFRRLKELGVQKLDFILVTHTKSDHIGNVDELLSTYPVDRVYLKKYSDSRITNSERLWDNLYGYDKVLQ 
TAAEKGVSVIQNITOGDAHFQFGDMDIQL'fTTYENETDSSGELKKIWDDNSNSLISVVKVNGKKIYLGGD 
LDNVHGAEDKYGPLIGKVDLMKF^HHDTNKSNTKDFIK^ 

RGIERINAASKDYDATVFDIRKDGFVNISTSYKPIPSFQAGWHKSAYGNWWYQAPDSTGEYAVGWNEIE 

GEWYYFNOTGILLQNQWKKWNNHWYLTDSGASAKNWKKIAGIWYYFNKENQMEIGWIQ^ 

DGSMKTGWLQYMGQWYYFAPSGE 

SP045 nucleotide (SEQ ID NO:71) 

CTTGGGTGTAACCCATATCCAGCTCCTTCCAGTCTTGTCTTACTACTTTGTCAATGAATTGAAAAACCA 
TGAACGCTTGTCTGACTACGCTTCAAGCAACAGCAACTACAACTGGGGATATGACCCTCAAAACTACTT 
CTCCTTGACTGGTATGTACTCAAGCGATCCTAAGAATCCAGAAAAACGAATCGCAGAATTTAAAAACCT 
C ATC AAC G AAATC C AC AAAC GTGGT ATGGG AGCT ATC C T AG ATGTCGTTT AT AACC AC AC AGCC AAAGT 
CG ATCTCTTTG AAG ATTTGGAACC AAACT ACT ACC AC TTT ATGG ATGC C G ATGGC AC AC CT C GAAC T AG 
CTTTGGTGGTGGACGCTTGGGGACAACCCACCATATGACCAAACGGCTCCTAATTGACTCTATCAAATA 
CCTAGTTGATACCTACAAAGTGGATGGCTTCCGTTTCGATATGATGGGAGACCATGACGCCGCTTCTAT 
CGAAGAAGCTTACAAGGCTGCACGCGCCCTCAATCCAAACCTCATCATGCTTGGTGAAGGTTGGAGAAC 
CTATGCCGGTGATGAAAACATGCCTACTAAAGCTGCTGACCAAGATTGGATGAAACATACCGATACTGT 
CGCTGTCTTTTCAGATGACATCCGTAACAACCTCAAATCTGGTTATCCAAACGAAGGTCAACCTGCCTT 
TATCACAGGTGGCAAGCGTGATGTCAACACCATCTTTAAAAATCTCATTGCTCAACCAACTAACTTTGA 
AGCTGACAGCCCTGGAGATGTCATCCAATACATCGCAGCCCATGATAACTTGACCCTCTTTGACATCAT 
TGCCCAGTCTATCAAAAAAGACCCAAGCAAGGCTGAGAACTATGCTGAAATCCACCGTCGTTTACGACT 
TGGAAATCTCATGGTCTTGACAGCTCAAGGAACTCCATTTATCCACTCCGGTCAGGAATATGGACGTAC 
T AAAC AATTC C GTG ACC C AGC CT AC AAG AC TCC AG T AGC AG AGG AT AAGGTTC C AAAC AAAT CT C ACTT 
GTTGCGTGATAAGGACGGCAACCCATTTGACTATCCTTACTTCATCCATGACTCTTACGATTCTAGTGA 
TGCAGTCAACAAGTTTGACTGGACTAAGGCTACAGATGGTAAAGCTTATCCTGAAAATGTCAAGAGCCG 
TGACTAT ATG AAAGGTTTG ATTGC C C TTCGTC AATCT AC AGATGC CTTC C G AC TT AAG AGTC TTC AAG A 
TATCAAAGACCGTGTCCACCTCATCACTGTCCCAGGCCAAAATGGTGTGGAAAAAGAGGATGTAGTGAT 
TGGCTACCAAATCACTGCTCCAAACGGCGATATCTACGCAGTCTTTGTCAATGCGGATGAAAAAGCTCG 
CGAATTTAATTTGGGAACTGCCTTTGCACATCTAAGAAATGCGGAAGTTTTGGCAGATGAAAACCAAGC 
AGGACCAGTCGGAATTGCCAACCCGAAAGGACTTGAATGGACTGAAAAAGGCTTGAAATTGAATGCCCT 
TACAGCTACTGTTCTTCGAGTCTCTCAAAATGGAACTAGCCATGAGTCAACTGCAGAAGAGAAACCAGA 
CTCAACCCCTTCCAAGCCTGAACATCAAAATGAAGCTTCTCACCCTGCACATCAAGACCCAGCTCCAGA 
AGCT AGAC CTGATTCT AC T AAAC C AG ATGC C AAAGT AGCTG ATGC GG AAAAT AAACCT AGC C AAGC T AC 
AGCTGATTCACAAGCTGAACAACCAGCACAAGAAGCACAAGCATCATCTGTAAAAGAAGCGGTTCGAAA 
CG AATCGGT AG AAAACTCT AGC AAGG AAAAT AT ACCTGC AAC C C C AG AT AAAC AAGCTGAA 

SP045 amino acid (SEQ ID NO:72)

LGVTHIQLLPVLSYYFTOELKNHERLSDYASSNS^^Y^JW^^ 

INEIHKRGMGAILDWYNHTAKVDLFEDLEPNYYH^ 

LVDTYKVIXJFRFDMMGDHDAASIEEAYKAARALNPN 
AVFSDDIRNNLKSGYPNEGQPAFITGGKRDVNTIFKNLIAQPTNFEADSPGDVIQYIAAHDNLTLFDII
AQSIKKDPSKAENYAEIHRRLRLGNLMV^TAQGTPFIHSGQEYGRTKQFRDPAYKTPVAEDKVPNKSHL 
LRBKIX3NPFDYPYFIHDSYDSSDAVNKFDWTKATDGKAYPENVKSRDYMKGLIALRQSTDAFRLKSLQD 
IKDRVHLITVPGQNGVEKEDWIGYQITAPNGDIYAVFVNADEKAREFNLGTAFAHLRNAEVI^ENQA 
GPVGIANPKGLEWTEKGLKLNALTATVLRVSQNGTSHESTAEEKPDSTPSKPEHQNEASHPAHQDPAPE 
ARPDSTKPDAXVADAENKPSQATADSQAEQ P AQEAQAS SVKEA VRNESVENS S KENI PAT PDKQAE 

SP046 nucleotide (SEQ ID NO:73) 

TAGTGATGGTACTTGGCAAGGAAAACAGTATCTGAAAGAAGATGGCAGTCAAGCAGCAAATGAGTGGGT 

TTTNGATACTCATTATCAATCTTGGTTCTATATAAAAGCAGATGCTAACTATGCTGAAAATGAATGG 

AAAGCAAGGTGACGACTATTTTTACCTCAAATCTGGTGGCTATATGGCCAAATCAGAATGGGTAGAAGA 

CAAGGGAGCCTTTTATTATCTTGACCAAGATGGAAAGATGAAAAGAAATGCTTGGGT^ 

TGTTGGTGCAACAGGTGCCAAAGTAATAGAAGACTGGGTCTATGATTCTCAATACGATGCTTGGTTTTA 

TATCAAAGCAGATGGACAGCACGCAGAGAAAGAATGGCTCCAAATTAAAGGGAAGGACTATTATTTC^ 

ATCCGGTGGTTATCTACTGACAAGTCAGTGGATTAATCAAGCTTATGTGAATGCTAGTGGTGCCAAAGT 

ACAGCAAGGTTGGCTTTTTGACAAACAATACCAATCTTGGTTTTACATCAAAGAAAATGGAAACTATGC 

TG AT AAAG AATGGATTTTCG AG AATGGTC ACT ATT ATT ATC T AAAATC CGG TGGC T AC ATGGC AGC C AA 

TGAATGGATTTGGGATAAGGAATCTTGGTTTTATCTCAAATTTGATGGGAAAATGGCTG 

GGTCTACGATTCTCATAGTCAAGCTTGGTACTACTTCAAATCCGGTGGTTACATGACAGCCAATGAATG 

GATTTGGGATAAGGAATCTTGGTTTTACCTCAAATCTGATGGGAAAATAGCTGAAAAAGAATGGGTCTA 

CGATTCTCATAGTCAAGCTTGGTACTACTTCAAATCTGGTGGCTACATGGCGAAAAATGAGACAGTAGA 

TGGTTATCAGCTTGGAAGCGATGGTAAATGGCTTGGAGGAAAAACTACAAATGAAAATGCTGCTTACTA 

TC AAGT AGTGCCTGTT AC AGCC AATGTTT ATG ATTC AG ATGGTG AAAAGCTTTCCT AT AT ATC G C AAG G 

TAGTGTCGTATGGCTAGATAAGGATAGAAAAAGTGATGACAAGCGCTTGGCTATTACTATTTCTGGTTT 

GTCAGGCTATATGAAAACAGAAGATTTACAAGCGCTAGATGCTAGTAAGGACTTTATCCCTTATTATGA 

GAGTGATGGCCACCGTTTTTATCACTATGTGGCTCAGAATGCTAGTATCCCAGTAGCTTCTCATCTTTC 

TGATATGG AAGT AGGCAAGAAAT ATT ATTCGGCAGATGGCCTGCATTTTGATGGTTTTAAGCTTGAGAA 

TCCCTTC C TTTT C AAAGATTT AAC AGAGGC T AC AAACT AC AGTG CTG AAG AATTGG AT AAGGT ATTT AG 

TTTGCTAAACATTAACAATAGCCTTTTGGAGAACAAGGGCGCTACTTTTAAGGAAGCCGAAGAACATTA 

CCATATCAATGCTCTTTATCTCCTTGCCCATAGTGCCCTAGAAAGTAACTGGGGAAGAAGTAAAATTGC 

C AAAG AT AAG AAT AATTTCTTTGGC ATT AC AGC CT ATGAT ACG ACC C C TT ACCTTTC TGCT AAG AC ATT 

TGATGATGTGGATAAGGGAATTTTAGGTGCAACCAAGTGGATTAAGGAAAATTATATCGATAGGGGAAG 

AACTTTCCTTGGAAACAAGGCTTCTGGTATGAATGTGGAATATGCTTCAGACCCTTATTGGGGCGAAAA 

AATTGC T AGTGTG ATG ATG AAAATC AATG AG AAGCTAGGTGGC AAAG AT 

SP046 amino acid (SEQ ID NO:74) 

SDGTWQGKQYLKEDGSQAANEWVXDTHYQSWFYIKADANYAENEWLKQGDDYFYLKSGGYMAKSEWVED 
KGAFYYLDQDGKMKRNAWVGTSYVGATGAKVIEDWVTO^ 

SGGYLLTSQWINQAYVNASGAKVQQG^^FDKQYQSWFYIKENGNYADKEWIFENGHYYYLKSGGYMAAN 
EWIWDKESWFYLKFDGKMAEKEWVYDSKSQAWYYFKSGGYMTANEWIWDKESWFYLKSDGKIAEKEWW 
DSHSQAWYYFKSGGYMAKNETVIXSYQLGSDGKWLX^KTTNENAAYYQWPV^ 

SWWLDKDRKSDDKRLAITI SGLSGYMKTEDLQALDASKDF I P YYESDGHRFYHYVAQNAS I PVASHLS 
DMEVGKKYYSADGLHFDGFKLENPFLFKDLTEATNYSAEELDKVFSLLNINNSLLENKGATFKEAEEHY 
HINALYLLAHSALESNWGRSKIAKDKNNFFGITAYI>TTPYLSAKTFDDVBKGILGATKWIKEOTIDRGR 
TFLGNKASGMNVEYASDPYWGEKIASVMMKINEKLGGKD 

SP048 nucleotide (SEQ ID NO:75)

TGGGATTCAATATGTCAGAGATGATACTAGAGATAAAGAAGAGGGAATAGAGTATGATGACGCTGACAA 
TGGGGATATTATTGTAAAAGTAGCGACTAAACCTAAGGTAGTAACCAAGAAAATTTCAAGTACGCGAAT 
TCGTTATGAAAAAGATGAAACAAAAGACCGTAGTGAAAATCCTGTTACAATTGATGGAGAGGATGGCTA 
TGTAACTACGACAAGGACCTACGATGTTAATCCAGAGACTGGTTATGTTACCGAACAGGTTACTGTTGA 
TAGAAAAGAAGCCACGGATACAGTTATCAAAGTTCCAGCTAAAAGCAAGGTTGAAGAAGTTCTTGTTCC 
ATTTGCTACTAAATATGAAGCAGACAATGACCTTTCTGCAGGACAGGAGCAAGAGATTACTCTAGGAAA 
GAATGGGAAAACAGTTACAACGATAACTTATAATGTAGATGGAAAGAGTGGACAAGTAACTGAGAGTAC 
TTTAAGTCAAAAAAAAGACTCtCAAACAAGAGTTGTTAAAAAAAGaACCArkCCCCAAGTTCTTGTCCA 
AGAAATTCCAATCGAAACAGAATATCTCGATGGCCCaACTCTTGATAAAaGTCAAGAAGTAGAAGAAGT 
AGGAGAAATTGGTAAATTACTCTTACTACAATCTATACTGG^AGATGAACGTGATGGAACAATTGAAGA 
AACT AC TTCT CGTC AAATT AC T AAAGAG ATGGT AAAAAG ACGT AT AAGG AG AGGG ACG AG AGAAC CTG A 
AAAAGTTGTTGTTCCTGAGCAATCATCTATTCCTTCGTATCCTGTATCTGTTACATCTAACCAAGGAAC
AGATGTAGCAGTAGAACCAGCTAAAGCAGTTGCTCCAACAACAGACTGGAAACAAGAAAATGGTATGTG 
GTATTTTTATAATACTGATGGTTCCATGGCAACAGGTTGGGTACAAGTTAATAGTTCATGGTACTACCT 
CAACAGCAACGGTTCTATGAAAGTCAATCAATGGTTCCAAGTTGGTGGTAAATGGTATTATGTAAATAC 
ATCGGGTGAGTTAGCGGTCAATACAAGTATAGATGGCTATAGAGTCAATGATAATGGTGAATGGGTGCG 
T 

SP048 amino acid (SEQ ID NO:76)

GIQYVRDDTRDKEEG I EYDDADNGD I 1 VKVATKPKWTKKI S STRI RYEKDETKDRSEN PVT IDG EDGY 

VTTTRTTOVNPETGYVTEQVTVDRKEATDTVIKVPAKSKVEE^VPFATKY 

NGKTVTTITYNVDGKSGQVTESTLSQKKDSQTRVWKRTXPQVL^ 

GEIGKLLLLQSILVDERDGTIEETTSRQITKEMVKRRIRRGTREPEKVVVPEQSSIPSYPVSVTSNQGT 

DVAVEIPAKAVAPTTDWKQENGMWYFYNTDGSMATGWQV^ 

SGELAVNTSIDGYRVNDNGEWVR

SP049 nucleotide (SEQ ID NO:77)

GG AT AAT AG AG AAGC ATT AAAAAC CTTTATG ACGGGTG AAAATTTTT ATC TC C AAC ATT ATCT AGG AGC 
AC AT AGGG AAG AACT AAATGG AG AGC ATGGCT AT ACCTTCCGTGTTTGGGC AC C T AATGCTC AGGCTGT 
TCACTTGGT^GGTGATTTTACCAACTGGATTGAAAATCAGATTCCAATGGTAAGAAATGATTTTGGGGT 
CTGGGAAGTCTTTACCAATATGGCTCAAGAAGGGCATATTTACAAATATCATGTCACACGTCAAAATGG 
TCATCAACTGATGAAGATTGACCCTTTTGCTGTCAGGTATGAGGCTCGTCCAGGAACAGGGGCAATCGT 
AACAGAGCTTCCTGAGAAGAAATGGAAGGATGGACTTTGGCTGGCACGAAGAAAACGTTGGGGCTTTGA 
AG AGCG TC CTGTC AAT ATTT ATG AAGTTC AC GCTGG ATC ATGG AAAAG AAATTCTG ATGGC AG TCCTT A 
TAGTTTTGCCCAGCTC AAGG ATGAACTCATTCCTTATCTCGTTGAAATG AACT ATACTCATATTGAGTT 
TATGCCCTTGATGTOCCATCCTTTGGGCTTGAGTTGGGGGTATCAGCTTATGGGTTACTTCGCTTTAGA 
GCATGCTTATGGCCGACCAGAGGAGTTTCAAGATTTTGTC 

SP049 amino acid (SEQ ID NO:78)

DNREALKTFMTGENFYLQHYLGAHREELNGEHGYTFRVWAPNAQAVHLVGDFTNWIENQIPMVRKDFGV 
WEWTNMAQEGHIYKYHVTRQNGHQLMKIDPFAVRYEARPGTGAIWELPEKKWKTC 
ERPVNIYEVHAGSWKRNSDGSPYSFAQLKDELIPYLVEMNYTHIEFMPLMSHPLGLSWGYQLMGYFALE 
HAYGRPEEFQDFV 

SP050 nucleotide (SEQ ID NO:79) 

AG ATTTTGTC G AGG AG TGTC AT AC C C AT AAT ATTGGGGTT ATTGTGG AC TGGGT AC C AGNTC ACTTT AC 
CATCAACGATGATGCCTTAGCCTATTATGATGGGACACCGACTTTTGAATACCAAGACCATAATAAGGC 
TCATAACCATGGTTGGGGTGCCCTTAATTTTGACCTTGGAAAAAATGAAGTCCAGTCCTTCTTAATTTC 
TTGC ATT AAGC ATTGG ATTG ATGTCT ATC ATTTGG ATGGT ATTCGTG TGG ATGC TGTT AGC AAC ATGCT 
C T ATTTGG AC T ATG ATG ATGC TCC ATGG AC ACCT AAT AAAG ATGGCGG AAATC TC AAC T ATG AAGG TT A 
TTATTTCCTTCAGCGCTTGAATGAGGTTATTAAGTTAGAATATCCAGATGTGATGATGATTGCAGAAGA 
AAGTTCGTCTGCGATCAAGATTACGGGAATGAAAGAGATTGGTGGTCTAGGATTTGACTACAAATGGAA 
CATGGGCTGGATGAATGATATCCTCCGTTTCTACGAAGAAGATCCGATCTATCGTAAATATGACTTTAA 
CCTGGTGACTTTCAGCTTTATGTATGTTTNCAAGGAGAATTATCTCTTGCCATTCTCGCACGATGAAGT 
GGTTCATGGCAAGAAGAGTATGATGCATAAGATGTGGGGAGATCGTTACAATCAATTCGCAGGCTTGCG 
CAATCTCTATACGTACCAAATTTGTCACCCTGGTAAGAAATTGCTCTTCATGGGTAGCGAATACGGTCA 
ATTCC TAG AATGG AAATC TG AAG AAC AGTTGG AATGGTC T AAC CT AG AAGACCC AATG AATGCT AAG AT 
GAAGTATTTCGCTTCTCAGCTAAACCAGTTTTACAAAGATCATCGCTGTCTGTGGGAAATTGATACCAG 
CT ATG ATGGT ATTG AAATC ATTG ATGCGG AT AATCG AG ACC AG AGTGTTCTTTC CTTTATTCG TAAGGG 
TAAAAAGGGA 

SP050 amino acid (SEQ ID NO:80) 

DFVEECHTHNIGVIVDWPXHFTIOTDALAYYDGTPTFEYQDHNK^ 
CIKHWIDVYHLDGIRVDAVSNMLYLDYDDAPWTPNKDGGNLNYEGYYF 

SSSAIKITGMKEIGGLGFDYKWNMGWMNDILRFYEEDPIYRKYDFNLVTFSFMYVXKENYLLPFSHDEV 
WGKKSMMHKMWGDRYNQFAGLRNLYTYQICHPGKKLLFMGSEYGQFLEWKSEEQLEWSNLEDPMNAKM 
KYFASQLNQFYKDHRCLWEIDTSYDGIEIIDADNRDQSVLSFIRKGKKG 

SP051 nucleotide (SEQ ID NO:81) 
ATCTGTAGTTTATGCGGATGAAACACTTATTACTCATACTGCTGAGAAACCTAAAGAGGAAAAAATGAT

AGTAGAAGAAAAGGCTGATAAAGCTTTGGAAACTAAAAATATAGTTGAAAGGACAGAACAAAGTGAACC 

TAGTTCAACTGAGGCTATTGCATCTGAGNAGAAAGAAGATGAAGCCGTAACTCCAAAAGAGGAAAAAGT 

GTC TGCT AAAC CGG AAG AAAAAGC TC C AAGG AT AG AATC AC AAGCTTC AAATC AAGAAAAACC GCTC AA 

GGAAGATGCTAAAGCTGTAACAAATGAAGAAGTGAATCAAATGATTGAAGACAGGAAAGTGGATTTTAA 

TCAAAATTGGTACTTTAAACTCAATGCAAATTCTAAGGAAGCCATTAAACCTGATGCAGACGTATCTAC 

GTGGAAAAAATTAGATTTACCGTATGACTGGAGTATCTTTAACGATTTCGATCATGAATCTCCTGCACA 

AAATGAAGGTGGACAGCTCAACGGTGGGGAAGCTTGGTATCGCAAGACTTTCAAACTAGATGAAAAAGA 

CCTCAAGAAAAATGTTCGCCTTACTTTTGATGGCGTCTACATGGATTCTCAAGTTTATGTCAATGGTCA 

GTTAGTGGGGCATTATCCAAATGGTTATAACCAGTTCTCATATGATATCACCAAATACCTTCAAAAAGA 

TGGTCGTGAGAATGTGATTGCTGTCCATGCAGTCAACAAACAGCCAAGTAGCCGTTGGTATTCAGGAAG 

TGGTATCTATCGTGATGTGACTTTACAAGTGACAGATAAGGTGCATGTTGAGAAAAATGGGACAACTAT 

TTTAACACCAAAACTTGAAGAACAACAACATGGCAAGGTTGAAACTCATGTGACCAGCA 

TACGGACGACAAAGACCATGAACTTGTAGCCGAATATCAAATCGTTGAACGAGGTGGTCATGCTGTAAC 

AGGCTTAGTTCGTACAGCGAGTCGTACCTTAAAAGCACATGAATCAACAAGCCTAGATGCGATTTTAGA 

AGTTGAAAGACCAAAACTCTGGACTGTTTTAAATGACAAACCTGCCTTGTACGAATTGATTACGCGTGT 

TTACCGTGACGGTCAATTGGTTGATGCTAAGAAGGATTTGTTTGGTTACCGTTACTATCACTGGACTCC 

AAATGAAGGTTTCTCTTTGAATGGTGAACGTATTAAATTCCATGGAGTATCCTTGCACCACGACCATGG 

GGCGCTTGGAGCAGAAGAAAACTATAAAGCAGAATATCGCCGTCTCAAACAAATGAAGGAGATGGGAGT 

TAACTCCATCCGTACAACCCACAACCCTGCTAGTGAGCAAACCTTGCAAATCGCAGCAGAACTAGGTTT 

ACTCGTTCAGGAAGAGGCCTTTGATACGTGGTATGGTGGCAAGAAACCTTATGACTATGGACGTTTCTT

TGAAAAAGATGCCACTCACCCAGAAGCTCGAAAAGGTGAAAAATGGTCTGATTTTGACCTACGTACCAT 

GGTCGAAAGAGGCAAAAACAACCCTGCTATCTTCATGTGGTCAATTGGTAATGAAATAGGTGAAGCTAA 

TGGTGATGCCCACTCTTTAGCAACTGTTAAACGTTTGGTTAAGGTTATCAAGGATGTTGATAAGACTCG

CTATGTTACCATGGGAGCAGATAAATTCCGTTTCGGTAATGGTAGCGGAGGGCATGAGAAAATTGCTGA 

TGAACTCGATGCTGTTGGATTTAACTATTCTGAAGATAATTACAAAGCCCTTAGAGCTAAGCATCCAAA 

ATGGTTGATTTATGGATCAGAAACATCTTCAGCTACCCGTACACGTGGAAGTTACTATCGCCCTGAACG

TGAATTGAAACATAGCAATGGACCTGAGCGTAATTATGAACAGTCAGATTATGGAAATGATCGTGTGGG

TTGGGGGAAAACAGCAACCGCTTCATGGACTTTTGACCGTGACAACGCTGGCTATGCTGGACAGTTTAT

CTGGACAGGTACGGACTATATTGGTGAACCTACACCATGGCACAACCAAAATCAAACTCCTGTTAAGAG 

CTCTTACTTTGGTATCGTAGATACAGCCGGCATTCCAAAACATGACTTCTATCTCTACCAAAGC 

SP051 amino acid (SEQ ID NO: 82)

SVVYADETLITHTAEKPKEEKMIVEEKADKALETKNIVERTEQSEPSSTEAIASEXKEDEAVTPKEEKV
SAKPEEKAPRIESQASNQEKPLKEDAKAVTNEEVNQMIEDRKVDFNQNWYFKLNANSKEAIKPDADVST
WKKLDLPYDWSIFNDFDHESPAQNEGGQLNGGEAWYRKTFKLDEKDLKKNVRLTFDGVYMDSQVYVNGQ
LVGHYPNGYNQFSYDITKYLQKDGRENVIAVHAVNKQPSSRWYSGSGIYRDVTLQVTDKVHVEKNGTTI
LTPKLEEQQHGKVETHVTSKIVNTDDKDHELVAEYQIVERGGHAVTGLVRTASRTLKAHESTSLDAILE
VERPKLWTVLNDKPALYELITRVYRDGQLVDAKKDLFGYRYYHWTPNEGFSLNGERIKFHGVSLHHDHG
ALGAEENYKAEYRRLKQMKEMGVNSIRTTHNPASEQTLQIAAELGLLVQEEAFDTWYGGKKPYDYGRFF
EKDATHPEARKGEKWSDFDLRTMVERGKNNPAIFMWSIGNEIGEANGDAHSLATVKRLVKVIKDVDKTR
YVTMGADKFRFGNGSGGHEKIADELDAVGFNYSEDNYKALRAKHPKWLIYGSETSSATRTRGSYYRPER
ELKHSNGPERNYEQSDYGNDRVGWGKTATASWTFDRDNAGYAGQFIWTGTDYIGEPTPWHNQNQTPVKS
SYFGIVDTAGIPKHDFYLYQS 

SP052 nucleotide (SEQ ID NO: 83) 

TTACTTTGGTATCGTAGATACAGCCGGCATTCCAAAACATGACTTCTATCTCTACCAAAGCCAATGGGT 
TTCTGTTAAGAAGAAACCGATGGTACACCTTCTTCCTCACTGGAACTGGGAAAACAAAGAATTAGCATC
CAAAGTAGCTGACTCAGAAGGTAAGATTCCAGTTCGTGCTTATTCGAATGCTTCTAGTGTAGAATTGTT 
CTTGAATGGAAAATCTCTTGGTCTTAAGACTTTCAATAAAAAACAAACCAGCGATGGGCGGACTTACCA 
AGAAGGTGCAAATGCTAATGAACTTTATCTTGAATGGAAAGTTGCCTATCAACCAGGTACCTTGGAAGC 
AATTGCTCGTGATGAATCTGGCAAGGAAATTGCTCGAGATAAGATTACGACTGCTGGTAAGCCAGCGGC 
AGTTCGTCTTATTAAGGAAGACCATGCGATTGCAGCAGATGGAAAAGACTTGACTTACATCTACTATGA 
AATTGTTGACAGCCAGGGGAATGTGGTTCCAACTGCTAATAATCTGGTTCGCTTCCAATTGCATGGCCA 
AGGTCAACTGGTCGGTGTAGATAACGGAGAACAAGCCAGCCGTGAACGCTATAAGGCGCAAGCAGATGG 
TTCTTGGATTCGTAAAGCATTTAATGGTAAAGGTGTTGCCATTGTCAAATCAACTGAACAAGCAGGGAA 
ATTCACCCTGACTGCCCACTCTGATCTCTTGAAATCGAACCAAGTCACTGTCTTTACTGGTAAGAAAGA 
AGGACAAGAGAAGACTGTTTTGGGGACAGAAGTGCCAAAAGTACAGACCArTATTGGAGAGGCACCTGA
AATGCCTACCACTGTTCCGTTTGTATACAGTGATGGTAGCCGTGCAGAACGTCCTGTAACCTGGTCTTC
AGTAGATGTGAGCAAGCCTGGTATTGTAACGGTGAAAGGTATGGCTGACGGACGAGAAGTAGAAGCTCG
TGTAGAAGTGATTGCTCTTAAATCAGAGCTACCAGTTGTGAAACGTATTGCTCCAAATACTGACTTGAA
TTCTGTAGACAAATCTGTTTCCTATGTTTTGATTGATGGAAGTGTTGAAGAGTATGAAGTGGACAAGTG 
GGAGATTGCCGAAGAAGATAAAGCTAAGTTAGCAATTCCAGGTTCTCGTATTCAAGCGACCGGTTATTT 
AGAAGGTCAACCAATTCATGCAACCCTTGTGGTAGAAGAAGGCAATCCTGCGGCACCTGCAGTACCAAC 
TGTAACGGTTGGTGGTGAGGCAGTAACAGGTCTTACTAGTCAAAAACCAATGCAATACCGCACTCTTGC 
TTATGGAGCTAAGTTGCCAGAAGTCACAGCAAGTGCTAAAAATGCAGCTGTTACAGTTCTTCAAGCAAG 
CGCAGCAAACGGCATGCGTGCGAGCATCTTTATTCAGCCTAAAGATGGTGGCCCTCTTCAAACCTATGC 
AATTCAATTCCTTGAAGAAGCGCCAAAAATTGCTCACTTGAGCTTGCAAGTGGAAAAAGCTGACAGTCT 
CAAAGAAGACCAAACTGTCAAATTGTCGGTTCGAGCTCACTATCAAGATGGAACGCAAGCTGTATTACC 
AGCTGATAAAGTAACCTTCTCTACAAGTGGTGAAGGGGAAGTCGCAATTCGTAAAGGAATGCTTGAGTT 
GCATAAGCCAGGAGCAGTCACTCTGAACGCTGAATATGAGGGAGCTAAAGACCAAGTTGAACTCACTAT 
CCAAGCCAATACTGAGAAGAAGATTGCGCAATCCATCCGTCCTGTAAATGTAGTGACAGATTTGCATCA 
GGAACCAAGTCTTCCAGCAACAGTAACAGTTGAGTATGACAAAGGTTTCCCTAAAACTCATAAAGTCAC 
TTGGCAAGCTATTCCGAAAGAAAAACTAGACTCCTATCAAACATTTGAAGTACTAGGTAAAGTTGAAGG 
AATTGACCTTGAAGCGCGTGCAAAAGTCTCTGTAGAAGGTATCGTTTCAGTTGAAGAAGTCAGTGTGAC 
AACTCCAATCGCAGAAGCACCACAATTACCAGAAAGTGTTCGGACATATGATTCAAATGGTCACGTTTC 
ATCAGCTAAGGTTGCATGGGATGCGATTCGTCCAGAGCAATACGCTAAGGAAGGTGTCTTTACAGTTAA 
TGGTCGCTTAGAAGGTACGCAATTAACA 

SP052 amino acid (SEQ ID NO: 84)

YFGIVDTAGIPKHDFYLYQSQWVSVKKKPMVHLLPHWNWENKELASKVADSEGKIPVRAYSNASSVELF
LNGKSLGLKTFNKKQTSDGRTYQEGANANELYLEWKVAYQPGTLEAIARDESGKEIARDKITTAGKPAA
VRLIKEDHAIAADGKDLTYIYYEIVDSQGNVVPTANNLVRFQLHGQGQLVGVDNGEQASRERYKAQADG
SWIRKAFNGKGVAIVKSTEQAGKFTLTAHSDLLKSNQVTVFTGKKEGQEKTVLGTEVPKVQTIIGEAPE
MPTTVPFVYSDGSRAERPVTWSSVDVSKPGIVTVKGMADGREVEARVEVIALKSELPVVKRIAPNTDLN
SVDKSVSYVLIDGSVEEYEVDKWEIAEEDKAKLAIPGSRIQATGYLEGQPIHATLVVEEGNPAAPAVPT
VTVGGEAVTGLTSQKPMQYRTLAYGAKLPEVTASAKNAAVTVLQASAANGMRASIFIQPKDGGPLQTYA
IQFLEEAPKIAHLSLQVEKADSLKEDQTVKLSVRAHYQDGTQAVLPADKVTFSTSGEGEVAIRKGMLEL
HKPGAVTLNAEYEGAKDQVELTIQANTEKKIAQSIRPVNVVTDLHQEPSLPATVTVEYDKGFPKTHKVT
WQAIPKEKLDSYQTFEVLGKVEGIDLEARAKVSVEGIVSVEEVSVTTPIAEAPQLPESVRTYDSNGHVS
SAKVAWDAIRPEQYAKEGVFTVNGRLEGTQLT

SP053 nucleotide (SEQ ID NO: 85) 

AGCTAAGGTTGCATGGGATGCGATTCGTCCAGAGCAATACGCTAAGGAAGGTGTCTTTACAGTTAATGG 
TCGCTTAGAAGGTACGCAATTAACAACTAAACTTCATGTTCGCGTATCTGCTCAAACTGAGCAAGGTGC 
AAACATTTCTGACCAATGGACCGGTTCAGAATTGCCACTTGCCTTTGCTTCAGACTCAAATCCAAGCGA 
CCCAGTTTCAAATGTTAATGACAAGCTCATTTCCTACAATAACCAACCAGCCAATCGTTGGACAAACTG 
GAATCGTACTAATCCAGAAGCTTCAGTCGGTGTTCTGTTTGGAGATTCAGGTATCTTGAGCAAACGCTC 
CGTTGATAATCTAAGTGTCGGATTCCATGAAGACCATGGAGTTGGTGTACCGAAGTCTTATGTGATTGA 
GTATTATGTTGGTAAGACTGTCCCAACAGCTCCTAAAAACCCTAGTTTTGTTGGTAATGAGGACCATGT 
CTTTAATGATTCTGCCAACTGGAAACCAGTTACTAATCTAAAAGCCCCTGCTCAACTCAAGGCTGGAGA 
AATGAACCACTTTAGCTTTGATAAAGTTGAAACCTATGCTGTTCGTATTCGCATGGTTAAAGCAGATAA 
CAAGCGTGGAACGTCTATCACAGAGGTACAAATCTTTGCGAAACAAGTTGCGGCAGCCAAGCAAGGACA 
AACAAGAATCCAAGTTGACGGCAAAGACTTAGCAAACTTCAACCCTGATTTGACAGACTACTACCTTGA 
GTCTGTAGATGGAAAAGTTCCGGCAGTCACAGCAAGTGTTAGCAACAATGGTCTCGCTACCGTCGTTCC 
AAGCGTTCGTGAAGGTGAGCCAGTTCGTGTCATCGCGAAAGCTGAAAATGGCGACATCTTAGGAGAATA 
CCGTCTGCACTTCACTAAGGATAAGAGCTTACTTTCTCATAAACCAGTTGCTGCGGTTAAACAAGCTCG 
CTTGCTACAAGTAGGTCAAGCACTTGAATTGCCGACTAAGGTTCCAGTTTACTTCACAGGTAAAGACGG 
CTACGAAACAAAAGACCTGACAGTTGAATGGGAAGAAGTTCCAGCGGAAAATCTGACAAAAGCAGGTCA 
ATTTACTGTTCGAGGCCGTGTCCTTGGTAGTAACCTTGTTGCTGAGATCACTGTACGAGTGACAGACAA 

ACTTGGTGAGACTCTTTCAGATAACCCTAACTATGATGAAAACAGTAACCAGGCCTTTGCTTCAGCAAC
CAATGATATTGACAAAAACTCTCATGACCGCGTTGACTATCTCAATGACGGAGATCATTCAGAAAATCG 
TCGTTGGACAAACTGGTCACCAACACCATCTTCTAATCCAGAAGTATCAGCGGGTGTGATTTTCCGTGA 
AAATGGTAAGATTGTAGAACGGACTGTTACACAAGGAAAAGTTCAGTTCTTTGCAGATAGTGGTACGGA 
TGCACCATCTAAACTCGTTTTAGAACGCTATGTCGGTCCAGAGTTTGAAGTGCCAACCTACTATTCAAA 
CTACCAAGCCTACGACGCAGACCATCCATTCAACAATCCAGAAAATTGGGAAGCTGTTCCTTATCGTGC
GGATAAAGACATTGCAGCTGGTGATGAAATCAACGTAACATTTAAAGCTATCAAAGCCAAAGCTATGAG

ATGGCGTATGGAGCGTAAAGCAGATAAGAGCGGTGTTGCGATGATTGAGATGACCTTCCTTGCACCAAG 

TGAATTGCCTCAAGAAAGCACTCAATCAAAGATTCTTGTAGATGGAAAAGAACTTGCTGATTTCGCTGA 

AAATCGTCAAGACTATCAAATTACCTATAAAGGTCAACGGCCAAAAGTCTCAGTTGAAGAAAACAATCA 

AGTAGCTTCAACTGTGGTAGATAGTGGAGAAGATAGCTTTCCAGTACTTGTTCGCCTCGTTTCAGAAAG 

TGGAAAACAAGTCAAGGAATACCGTATCCACTTGACTAAGGAAAAACCAGTTTCTGAGAAGACAGTTGC 

TGCTGTACAAGAAGATCTrcCAAAAATCGAATTTGTTGAAAAAGATTTGGCATACAAGACAGTTC 

AAAAGATTCAACACTGTATCTAGGTGAAACTCGTGTAGAACAAGAAGGAAAAGTTGGAAAAGAACGTAT 

CTTTACAGCGATTAATCCTGATGGAAGTAAGGAAGAAAAACTCCGTGAAGTGGTAGAAGTTCCGACAGA 

CCGCATCGTCTTGGTTGGAACCAAACCAGTAGCTCAAGAAGCTAAAAAACCACAAGTGTCAGAAAAAGC 

AGATACAAAACCAATTGATTCAAGTGAAGCTAGTCAAACTAATAAAGCCCAG 

SP053 amino acid (SEQ ID NO: 86)

AKVAWDAIRPEQYAKEGVFTVNGRLEGTQLTTKLHVRVSAQTEQGANISDQWTGSELPLAFASDSNPSD
PVSNVNDKLISYNNQPANRWTNWNRTNPEASVGVLFGDSGILSKRSVDNLSVGFHEDHGVGVPKSYVIE
YYVGKTVPTAPKNPSFVGNEDHVFNDSANWKPVTNLKAPAQLKAGEMNHFSFDKVETYAVRIRMVKADN
KRGTSITEVQIFAKQVAAAKQGQTRIQVDGKDLANFNPDLTDYYLESVDGKVPAVTASVSNNGLATVVP
SVREGEPVRVIAKAENGDILGEYRLHFTKDKSLLSHKPVAAVKQARLLQVGQALELPTKVPVYFTGKDG
YETKDLTVEWEEVPAENLTKAGQFTVRGRVLGSNLVAEITVRVTDKLGETLSDNPNYDENSNQAFASAT
NDIDKNSHDRVDYLNDGDHSENRRWTNWSPTPSSNPEVSAGVIFRENGKIVERTVTQGKVQFFADSGTD
APSKLVLERYVGPEFEVPTYYSNYQAYDADHPFNNPENWEAVPYRADKDIAAGDEINVTFKAIKAKAMR
WRMERKADKSGVAMIEMTFLAPSELPQESTQSKILVDGKELADFAENRQDYQITYKGQRPKVSVEENNQ
VASTVVDSGEDSFPVLVRLVSESGKQVKEYRIHLTKEKPVSEKTVAAVQEDLPKIEFVEKDLAYKTVEK
KDSTLYLGETRVEQEGKVGKERIFTAINPDGSKEEKLREVVEVPTDRIVLVGTKPVAQEAKKPQVSEKA
DTKPIDSSEASQTNKAQ

SP054 nucleotide (SEQ ID NO: 87) 

CTATCACTATGTAAATAAAGAGATTATTTCACAAGAAGCTAAAGATTTAATTCAGACAGGAAAGCCTGA 
CAGGAATGAAGTTGTATATGGTTTGGTGTATCAAAAAGATCAGTTGCCTCAAACAGGGACAGAA 

SP054 amino acid (SEQ ID NO:88)

YHYVNKEIISQEAKDLIQTGKPDRNEVVYGLVYQKDQLPQTGTE

SP055 nucleotide (SEQ ID NO: 89) 

TGAGACTCCTCAATCAATAACAAATCAGGAGCAAGCTAGGACAGAAAACCAAGTAGTAGAGACAGAGGA
AGCTCCAAAAGAAGAAGCACCTAAAACAGAAGAAAGTCCAAAGGAAGAACCAAAATCGGAGGTAAAACC
TACTGACGACACCCTTCCTAAAGTAGAAGAGGGGAAAGAAGATTCAGCAGAACCAGCTCCAGTTGAAGA
AGTAGGTGGAGAAGTTGAGTCAAAACCAGAGGAAAAAGTAGCAGTTAAGCCAGAAAGTCAACCATCAGA
CAAACCAGCTGAGGAATCAAAAGTTGAACAAGCAGGTGAACCAGTCGCGCCAAGAGAAGACGAAAAGGC
ACCAGTCGAGCCAGAAAAGCAACCAGAAGCTCCTGAAGAAGAGAAGGCTGTAGAGGAAACACCGAAACA
AGAAGAGTCAACTCCAGATACCAAGGCTGAAGAAACTGTAGAACCAAAAGAGGAGACTGTTAATCAATC
TATTGAACAACCAAAAGTTGAAACGCCTGCTGTAGAAAAACAAACAGAACCAACAGAGGAACCAAAAGT
TGAACAAGCAGGTGAACCAGTCGCGCCAAGAGAAGACGAACAGGCACCAACGGCACCAGTTGAGCCAGA 
AAAGCAACCAGAAGTTCCTGAAGAAGAGAAGGCTGTAGAGGAAACACCGAAACCAGAAGATAAAATAAA 
GGGTATTGGTACTAAAGAACCAGTTGATAAAAGTGAGTTAAATAATCAAATTGATAAAGCTAGTTCAGT 

TTCTCCTACTGATTAT 

SP055 amino acid (SEQ ID NO: 90) 

ETPQSITNQEQARTENQVVETEEAPKEEAPKTEESPKEEPKSEVKPTDDTLPKVEEGKEDSAEPAPVEE
VGGEVESKPEEKVAVKPESQPSDKPAEESKVEQAGEPVAPREDEKAPVEPEKQPEAPEEEKAVEETPKQ
EESTPDTKAEETVEPKEETVNQSIEQPKVETPAVEKQTEPTEEPKVEQAGEPVAPREDEQAPTAPVEPE
KQPEVPEEEKAVEETPKPEDKIKGIGTKEPVDKSELNNQIDKASSVSPTDY 

SP056 nucleotide (SEQ ID NO: 91) 

GGATGCTCAAGAAACTGCGGGAGTTCACTATAAATATGTGGCAGATTCAGAGCTATCATCAGAAGAAAA 

GAAGCAGCTTGTCTATGATATTCCGACATACGTGGAGAATGATGATGAAACTTATTATCTTC 

GTTAAATTCTCAAAATCAACTGGCGGAATTGCCAAATACTGGAAGCAAGAATGAGAGGCAA

SP056 amino acid (SEQ ID NO: 92)

DAQETAGVHYKYVADSELSSEEKKQLVYDIPTYVENDDETYYLV^

SP057 nucleotide (SEQ ID NO: 93) 

CGACAAAGGTGAGACTGAGGTTCAACCAGAGTCGCCAGATACTGTGGTAAGTGATAAAGGTGAACCAGA 
GCAGGTAGCACCGCTTCCAGAATATAAGGGTAATATTGAGCAAGTAAAACCTGAAACTCCGGTTGAGAA 
GACCAAAGAACAAGGTCCAGAAAAAACTGAAGAAGTTCCAGTAAAACCAACAGAAGAAACACCAGTAAA 
TCCAAATGAAGGTACTACAGAAGGAACCTCAATTCAAGAAGCAGAAAATCCAGTTCAACCTGCAGAAGA 
ATCAACAACGAATTCAGAGAAAGTATCACCAGATACATCTAGCAAAAATACTGGGGAAGTGTCCAGTAA 
TCCTAGTGATTCGACAACCTCAGTTGGAGAATCAAATAAACCAGAACATAATGACTCTAAAAATGAAAA 
TTCAGAAAAAACTGTAGAAGAAGTTCCAGTAAATCCAAATGAAGGCACAGTAGAAGGTACCTCAAATCA 
AGAAACAGAAAAACCAGTTCAACCTGCAGAAGAAACACAAACAAACTCTGGGAAAATAGCTAACGAAAA 
TACTGGAGAAGTATCCAATAAACCTAGTGATTCAAAACCACCAGTTGAAGAATCAAATCAACCAGAAAA 
AAACGGAACTGCAACAAAACCAGAAAATTCAGGTAATACAACATCAGAGAATGGACAAACAGAACCAGA
ACCATCAAACGGAAATTCAACTGAGGATGTTTCAACCGAATCAAACACATCCAATTCAAATGGAAACGA 
AGAAAT^AAACAAGAAAATGAACTAGACCCTGATAAAAAGGTAGAAGAACCAGAGAAAACACTTGAATT 
AAGAAAT 

SP057 amino acid (SEQ ID NO: 94)

DKGETEVQPESPDTVVSDKGEPEQVAPLPEYKGNIEQVKPETPVEKTKEQGPEKTEEVPVKPTEETPVN
PNEGTTEGTSIQEAENPVQPAEESTTNSEKVSPDTSSKNTGEVSSNPSDSTTSVGESNKPEHNDSKNEN
SEKTVEEVPVNPNEGTVEGTSNQETEKPVQPAEETQTNSGKIANENTGEVSNKPSDSKPPVEESNQPEK
NGTATKPENSGNTTSENGQTEPEPSNGNSTEDVSTESNTSNSNGNEEIKQENELDPDKKVEEPEKTLEL 

SP058 nucleotide (SEQ ID NO:95) 

AAATCAATTGGTAGCACAAGATCCAAAAGCACAAGATAGCACTAAACTGACTGCTGAAAAATCAACTGT 
TAAAGCACCTGCTCAAAGAGTAGATGTAAAAGATATAACTCATTTAACAGATGAAGAAAAAGTTAAGGT 
TGCTATTTTACAAGCAAATGGTTCAGCATTAGACGGAGCGACAATCAATGTAGCTGGAGATGGTACAGC 
AACAATCACATTCCCAGATGGTTCAGTAGTGACGATTCTAGGAAAAGATACAGTTCAACAATCTGCGAA 
AGGTGAATCTGTAACTCAAGAAGCTACACCAGAGTATAAGCTAGAAAATACACCAGGTGGAGATAAGGG 
AGGCAATACTGGAAGCTCAGATGCTAATGCGAATGAAGGCGGTGGTAGCCAGGCGGGTGGATCAGCTCA 
CACAGGTTCACAAAACTCAGCTCAATCACAAGCTTCTAAGCAATTAGCTACTGAAAAAGAATCAGCTAA 
AAATGCCATTGAAAAAGCAGCCAAGGACAAGCAGGATGAAATCAAAGGCGCACCGCTTTCTGATAAAGA
AAAAGCAGAACTTTTAGCAAGAGTGGAAGCAGAAAAACAAGCAGCTCTCAAAGAGATTGAAAATGCGAA
AACTATGGAAGATGTGAAGGAAGCAGAAACGATTGGAGTGCAAGCCATTGCCATGGTTACAGTTCCTAA

GAGACCAGTGGCTCCTAAT 

SP058 amino acid (SEQ ID NO: 96) 

NQLVAQDPKAQDSTKLTAEKSTVKAPAQRVDVKDITHLTDEEKVKVAILQANGSALDGATINVAGDGTA 
TITFPDGSVVTILGKDTVQQSAKGESVTQEATPEYKLENTPGGDKGGNTGSSDANANEGGGSQAGGSAH
TGSQNSAQSQASKQLATEKESAKNAIEKAAKDKQDEIKGAPLSDKEKAELLARVEAEKQAALKEIENAK
TMEDVKEAETIGVQAIAMVTVPKRPVAPN 

SP059 nucleotide (SEQ ID NO:97) 

CAAACAGTCAGCTTCAGGAACGATTGAGGTGATTTCACGAGAAAATGGCTCTGGGACACGGGGTGCCTT
CACAGAAATCACAGGGATTCTCAAAAAAGACGGTGATAAAAAAATTGACAACACTGCCAAAACAGCTGT
GATTCAAAATAGTACAGAAGGTGTTCTCTCAGCAGTTCAAGGGAATGCTAATGCTATCGGCTACATCTC 
CTTGGGATCTTTAACGAAATCTGTCAAGGCTTTAGAGATTGATGGTGTCAAGGCTAGTCGAGACACAGT 
TTTAGATGGTGAATACCCTCTTCAACGTCCCTTCAACATTGTTTGGTCTTCTAATCTTTCCAAGCTAGG
TCAAGATTTTATCAGCTTTATCCACTCCAAACAAGGTCAACAAGTGGTCACAGATAATAAATTTATTGA 
AGCTAAAACCGAAACCACGGAATATACAAGCCAACACTTATCAGGCAAGTTGTCTGTTGTAGGTTCCAC 
TTCAGTATCTTCTTTAATGGAAAAATTAGCAGAAGCTTATAAAAAAGAAAATCCAGAAGTTACGATTGA
TATTACCTCTAATGGGTCTTCAGCAGGTATTACCGCTGTTAAGGAGAAAACCGCTGATATTGGTATGGT 
TTCTAGGGAATTAACTCCTGAAGAAGGTAAGAGTCTCACCCATGATGCTATTGCTTTAGACGGTATTGC 
TGTTGTGGTCAATAATGACAATAAGGCAAGCCAAGTCAGTATGGCTGAACTTGCAGACGTTTTTAGTGG 
CAAATTAACCACCTGGGACAAGATTAAA

SP059 amino acid (SEQ ID NO: 98)

KQSASGTIEVISRENGSGTRGAFTEITGILKKDGDKKIDNTAKTAVIQNSTEGVLSAVQGNANAIGYIS
LGSLTKSVKALEIDGVKASRDTVLDGEYPLQRPFNIVWSSNLSKLGQDFISFIHSKQGQQVVTDNKFIE
AKTETTEYTSQHLSGKLSVVGSTSVSSLMEKLAEAYKKENPEVTIDITSNGSSAGITAVKEKTADIGMV
SRELTPEEGKSLTHDAIALDGIAVVVNNDNKASQVSMAELADVFSGKLTTWDKIK

SP060 nucleotide (SEQ ID NO: 99) 

ATTCGATGATGCGGATGAAAAGATGACCCGTGATGAAATTGCCTATATGCTGACAAATAGTGAAGAAAC 

ATTGGATGCTGATGAGATTGAGATGCTACAAGGTGTCTTTTCGCTCGATGAACTGATGGCACGAGAGGT 

TATGGTTCCTCGAACGGATGCCTTTATGGTGGATATTCAGGATGATAGTCAAGCCATTATCCAAAGTAT 

TTTAAAACAAAATTATTCTCGTATCCCGGTTTATGATGGGGATAAGGACAATC 

CACCAAGAGTCTCCTTAAGGCAGGCTTTGTGGACGGTTTTGACAATA 

AGATCCACTTTTTGTACCTGAAACTATTTTTGTGGATGACTTGCTAAAAGAACTGCGAAATACCCAAAG 
ACAAATG 

SP060 amino acid (SEQ ID NO: 100) 

FDDADEKMTRDEIAYMLTNSEETLDADEIEMLQGVFSLDELMAREVMVPRTDAFMVDIQDDSQAIIQSI

LKQIWSRIPVYDGDKDhA/IGIIHTKSLLKAGFVDGFDNIVWKRI^ 

QM 

SP062 nucleotide (SEQ ID NO: 101) 

GGAGAGTCGATCAAAAGTAGATGAAGCTGTGTCTAAGTTTGAAAAGGACTCATCTTCTTCGTCAAGTTC 
AGACTCTTCCACTAAACCGGAAGCTTCAGATACAGCGAAGCCAAACAAGCCGACAGAACCAGGAGAAAA 
GGTAGCAGAAGCTAAGAAGAAGGTTGAAGAAGCTGAGAAAAAAGCCAAGGATCAAAAAGAAGAAGATCG 
TCGTAACTACCCAACCATTACTTACAAAACGCTTGAACTTGAAATTGCTGAGTCCGATGTGGAAGTTAA 
AAAAGCGGAGCTTGAACTAGTAAAAGTGAAAGCTAACGAACCTCGAGACGAGCAA 

SP062 amino acid (SEQ ID NO: 102)

ESRSKVDEAVSKFEKDSSSSSSSDSSTKPEASDTAKPNKPTEPGEKVAEAKKKVEEAEKKAKDQKEEDR
RNYPTITYKTLELEIAESDVEVKKAELELVKVKANEPRDEQ 

SP063 nucleotide (SEQ ID NO: 103) 

ATGGACAACAGGAAACTGGGACGAGGTTATATCTGGTAAGATTGACAAGTACAAAGATCCAGATATTCC 
AACAGTTGAATCACAAGAAGTTACGTCAGACTCTAGTGATAAAGAAATAACGGTAAGGTATGACCGTTT 
ATCAACACCAGAAAAACCAATCCCACAACCAAATCCAGAGCATCCAAGTGTTCCGACACCAAACCCAGA
ACTACCAAATCAAGAGACTCCAACACCAGATAAACCAACTCCAGAACCAGGTACTCCAAAAACTGAAAC
TCCAGTGAATCCAGACCCAGAAGTTCCGACTTATGAGACAGGTAAGAGAGAGGAATTGCCAAACACAGG
TACAGAAGCTAAT 

SP063 amino acid (SEQ ID NO:104) 

WTTGNWDEVISGKIDKYKDPDIPTVESQEVTSDSSDKEITVRYDRLSTPEKPIPQPNPEHPSVPTPNPE 
LPNQETPTPDKPTPEPGTPKTETPVNPDPEVPTYETGKREELPNTGTEAN 

SP064 nucleotide (SEQ ID NO:105)

CGATGGGCTCAATCCAACCCCAGGTCAAGTCTTACCTGAAGAGACATCGGGAACGAAAGAGGGTGACTT
ATCAGAAAAACCAGGAGACACCGTTCTCACTCAAGCGAAACCTGAGGGCGTTACTGGAAATACGAATTC 
ACTTCCGACACCTACAGAAAGAACTGAAGTGAGCGAGGAAACAAGCCCTTCTAGTCTGGATACACTTTT 
TGAAAAAGATGAAGAAGCTCAAAAAAATCCAGAGCTAACAGATGTCTTAAAAGAAACTGTAGATACAGC 
TGATGTGGATGGGACACAAGCAAGTCCAGCAGAAACTACTCCTGAACAAGTAAAAGGTGGAGTGAAAGA 
AAATACAAAAGACAGCATCGATGTTCCTGCTGCTTATCTTGAAAAAGCTGAAGGGAAAGGTCCTTTCAC 
TGCCGGTGTAAACCAAGTAATTCCTTATGAACTATTCGCTGGTGATGGTATGTTAACTCGTCTATTACT 
AAAAGCTTCGGATAATGCTCCTTGGTCTGACAATGGTACTGCTAAAAATCCTGCTTTACCTCCTCTTGA 
AGGATTAACAAAAGGGAAATACTTCTATGAAGTAGACTTAAATGGCAATACTGTTGGTAAACAAGGTCA 
AGCTTTAATTGATCAACTTCGCGCTAATGGTACTCAAACTTATAAAGCTACTGTTAAAGTTTACGGAAA 
TAAAGACGGTAAAGCTGACTTGACTAATCTAGTTGCTACTAAAAATGTAGACATCAACATCAATGGATT 
AGTTGCTAAAGAAACAGTTCAAAAAGCCGTTGCAGACAACGTTAAAGACAGTATCGATGTTCCAGCAGC 
CTACCTAGAAAAAGCCAAGGGTGAAGGTCCATTCACAGCAGGTGTCAACCATGTGATTCCATACGAACT 
CTTCGCAGGTGATGGCATGTTGACTCGTCTCTTGCTCAAGGCATCTGACAAGGCACCATGGTCAGATAA
CGGCGACGCTAAAAACCCAGCCCTATCTCCACTAGGCGAAAACGTGAAGACCAAAGGTCAATACTTCTA
TCAANTAGCCTTGGACGGAAATGTAGCTGGCAAAGAAAAACAAGCGCTCATTGACCAGTTCCGAGCAAA 
NGGTACTCAAACTTACAGCGCTACAGTCAATGTCTATGGTAACAAAGACGGTAAACCAGACTTGGACAA 
CATCGTAGCAACTAAAAAAGTCACTATTAACATAAACGGTTTAATTTCTAAAGAAACAGTTCAAAAAGC 
CGTTGCAGACAACGTTAANGACAGTATCGATGTTCCAGCAGCCTACCTAGAAAAAGCCAAGGGTGAAGG 
TCCATTCACAGCAGGTGTCAACCATGTGATTCCATACGAACTCTTCGCAGGTGATGGTATGTTGACTCG 
TCTCTTGCTCAAGGCATCTGACAAGGCACCATGGTCAGATAACGGNGACGCTAAAAACCCAGCNCTATC 
TCCACTAGGTGAAAACGTGAAGACCAAAGGTCAATACTTCTATCAANTAGCCTTGGACGGAAATGTAGC 
TGGCAAAGAAAAACAAGCGCTCATTGACCAGTTCCGAGCAAACGGTACTCAAACTTACAGCGCTACAGT 
CAATGTCTATGGTAACAAAGACGGTAAACCAGACTTGGACAACATCGTAGCAACTAAAAAAGTCACTAT 
TAAGATAAATGTTAAAGAAACATCAGACACAGCAAATGGTTCATTATCACCTTCTAACTCTGGTTCTGG 
CGTGACTCCGATGAATCACAATCATGCTACAGGTACTACAGATAGCATGCCTGCTGACACCATGACAAG
TTCTACCAACACGATGGCAGGTGAAAACATGGCTGCTTCTGCTAACAAGATGTCTGATACGATGATGTC
AGAGGATAAAGCTATG 

SP064 amino acid (SEQ ID NO: 106)

DGLNPTPGQVLPEETSGTKEGDLSEKPGDTVLTQAKPEGVTGNTNSLPTPTERTEVSEETSPSSLDTLF
EKDEEAQKNPELTDVLKETVDTADVDGTQASPAETTPEQVKGGVKENTKDSIDVPAAYLEKAEGKGPFT
AGVNQVIPYELFAGDGMLTRLLLKASDNAPWSDNGTAKNPALPPLEGLTKGKYFYEVDLNGNTVGKQGQ
ALIDQLRANGTQTYKATVKVYGNKDGKADLTNLVATKNVDININGLVAKETVQKAVADNVKDSIDVPAA
YLEKAKGEGPFTAGVNHVIPYELFAGDGMLTRLLLKASDKAPWSDNGDAKNPALSPLGENVKTKGQYFY
QXALDGNVAGKEKQALIDQFRAXGTQTYSATVNVYGNKDGKPDLDNIVATKKVTININGLISKETVQKA
VADNVXDSIDVPAAYLEKAKGEGPFTAGVNHVIPYELFAGDGMLTRLLLKASDKAPWSDNGDAKNPALS
PLGENVKTKGQYFYQXALDGNVAGKEKQALIDQFRANGTQTYSATVNVYGNKDGKPDLDNIVATKKVTI
KINVKETSDTANGSLSPSNSGSGVTPMNHNHATGTTDSMPADTMTSSTNTMAGENMAASANKMSDTMMS
EDKAM

SP065 nucleotide (SEQ ID NO: 107) 

TTCCAATCAAAAACAGGCAGATGGTAAACTCAATATCGTGACAACCTTTTACCCTGTCTATGArTTTAC 
CAAGCAAGTCGCAGGAGATACGGCTAATGTAGAACTCCTAATCGGTGCTGGGACAGAACCTCATGAATA 
CGAACCATCTGCCAAGGCAGTTGCCAAAATCCAAGATGCAGATACCTTCGTTTATGAAAATGAAAACAT
GGAAACATGGGTACCTAAATTGCTAGATACCTTGGATAAGAAAAAAGTGAAAACCATCAAGGCGACAGG
CGATATGTTGCTCTTGCCAGGTGGCGAGGAAGAAGAGGGAGACCATGACCATGGAGAAGAAGGTCATCA
CCATGAGTTTGACCCCCATGTTTGGTTATCACCAGTTCGTGCCATtAAACTAGTAGAGCACCATCCGCG 
ACACTTGTCAGCAGATTATCCTGATAAAAAAGAGACCTTTGAGAAGAATGCAGCTGCCTATATCGAAAA 
ATTGCAAGCCTTGGATAAGGCTTACGCAGAAGGTTTGTCTCAAGCAAAACAAAAGAGCTTTGTGACTCA 
ACACGCAgCCTTTAACTaTCTTGCCTTGGACTATGGGACTC 

SP065 amino acid (SEQ ID NO: 108) 

SNQKQADGKLNIVTTFYPVYEFTKQVAGDTANVELLIGAGTEPHEYEPSAKAVAKIQDADTFVYENENM
ETWVPKLLDTLDKKKVKTIKATGDMLLLPGGESEEGDHDHGEEGHHHEFDPHVWLSPVRAIKLVEHHPR
HLSADYPDKKETFEKNAAAYIEKLQALDKAYAEGLSQAKQKSFVTQHAAFNYLALDYGT 

SP067 nucleotide (SEQ ID NO: 109) 

TATCACAGGATCGAACGGTAAGACAACCACAACGACTATGATTGGGGAAGTTTTGACTGCTGCTGGCCA 
ACATGGTCTTTTATCAGGGAATATCGGCTATCCAGCTAGTCAGGTTGCTCAAATAGCATCAGATAAGGA 
CACGCTTGTTATGGAACTTTCTTCTTTCCAACTCATGGGTGTTCAAGAATTCCATCCAGAGATTGCGGT 
TATTACCAACCTCATGCCAACTCATATCGACTACCATGGGTCATTTTCGGAATATGTAGCAGCCAAGTG 
GAATATCCAGAACAAGATGACAGCAGCTGATTTCCTTGTCTTGAACTTTAATCAAGACTTGGCAAAAGA
CTTGACTTCCAAGACAGAAGCCACTGTTGTACCATTTTCAACACTTGAAAAGGTTGATGGAGCTTATCT 
GGAAGATGGTCAACTCTACTTCCGTGGTGAAGTAGTCATGGCAGCGAATGAAATCGGTGTTCCAGGTAG
CCACAATGTGGAAAATGCCCTTGCGACTATTGCTGTAGCCAAGCTTCGTGATGTGGACAATCAAACCAT
CAAGGAAACTCTTTCAGCCTTCGGTGGTGTCAAACACCGTCTCCAGTTTGTGGATGACATCAAGGGTGT 
TAAATTCTATAACGACAGTAAATCAACTAATATCTTGGCTACTCAAAAAGCCTTGTCAGGATTTGACAA
CAGCAAGGTCGTCTTGATTGCAGGTGGTTTGGACCGTGGCAATGAGTTTGACGAATTGGTGCCAGACAT
TACTGGACTCAAGAAGATGGTCATCCTGGGTCAATCTGCAGAACGTGTCAAACGGGCAGCAGACAAGGC 
TGGTGTCGCTTATGTGGAGGCGACAGATATTGCAGATGCGACCCGCAAGGCCTATGAGCTTGCGACTCA
AGGAGATGTGGTTCTTCTTAGTCCTGCCAATGCTAGCTGGGATATGTATGCTAACTTTGAAGTACGTGG
CGACCTCTTTATCGACACAGTAGCGGAGTTAAAAGAA 

SP067 amino acid (SEQ ID NO: 110) 

GITGSNGKTTTTTMIGEVLTAAGQHGLLSGNIGYPASQVAQIASDKDTLVMELSSFQLMGVQEFHPEIA
VITNLMPTHIDYHGSFSEYVAAKWNIQNKMTAADFLVLNFNQDLAKDLTSKTEATVVPFSTLEKVDGAY
LEDGQLYFRGEVVMAANEIGVPGSHNVENALATIAVAKLRDVDNQTIKETLSAFGGVKHRLQFVDDIKG
VKFYNDSKSTNILATQKALSGFDNSKVVLIAGGLDRGNEFDELVPDITGLKKMVILGQSAERVKRAADK
AGVAYVEATDIADATRKAYELATQGDVVLLSPANASWDMYANFEVRGDLFIDTVAELKE

SP068 nucleotide (SEQ ID NO: 111)

AAGTTCATCGAAGATGGTTGGGAAGTCCACTATATCGGGGACAAGTGTGGTATCGAACACCAAGAAATC 

CTTAAGTCAGGTTTGGATGTCACCTTCCATTCTATTGCGACTGGAAAATTGCGTCGCTATTTCTCTTGG 

CAAAATATGCTGGACGTCTTCAAAGTTGGTTGGGGAATTGTCCAATCGCTCTTTATCATGTTGCG 

CGTCCACAGACCCTTTTTTCAAAGGGGGGCTTTGTCTCAGTACCGCCTGTTATCGCTGCGCGTGTGTCA 

GGAGTGCCTGTCTTTATTCACGAATCTGACCTGTCTATGGGCTTGGCCAATAAAATCGCCTATAAATTT

GCGACTAAGATGTATTCAACCTTTGAACAAGCTTCGAGTTTGGCTAAGGTTGAGCATGTGGGAGCGG 

SP068 amino acid (SEQ ID NO: 112) 

SSSKMVGKSTISGTSWSNTKXSLSQVWMSPSILLRLENC^AISLGKICWTSSKLVGELS?IRSLSCCDC 
VHRPFFQRGALSQYRLLSLRVCQECLSLFTNLTCLWAWPIKSPINLRLRCIQPLNKLRVWLRLSMWER 

SP069 nucleotide (SEQ ID NO: 113) 

ATCGCTAGCTAGTGAAATGCAAGAAAGTACACGTAAATTCAAGGTTACTGCTGACCTAACAGATGCCGG
TGTTGGAACGATTGAAGTTCCTTTGAGCATTGAAGATTTACCCAATGGGCTGACCGCTGTGGCGACTCC 
GCAAAAAATTACAGTCAAGATTGGTAAGAAGGCTCAGAAGGATAAGGTAAAGATTGTACCAGAGATTGA 
CCCTAGTCAAATTGATAGTCGGGTACAAATTGAAAATGTCATGGTGTCAGATAAAGAAGTGTCTATTAC 
GAGTGACCAAGAGACATTGGATAGAATTGATAAGATTATCGCTGTTTTGCCAACTAGCGAACGTATAAC 
AGGTAATTACAGTGGTTCAGTACCTTTGCAGGCAATCGACCGCAATGGTGTTGTCTTACCGGCAGTTAT 
CACTCCGTTTGATACAATAATGAAGGTGACTACAAAACCAGTAGCACCAAGTTCAAGCACATCAAATTC 
AAGTACAAGCAGTTCATCGGAGACATCTTCGTCAACGAAAGCAACTAGTTCAAAAACGAAT 

SP069 amino acid (SEQ ID NO:114) 

SLASEMQESTRKFKVTADLTDAGVGTIEVPLSIEDLPNGLTAVATPQKITVKIGKKAQKDKVKIVPEID
PSQIDSRVQIENVMVSDKEVSITSDQETLDRIDKIIAVLPTSERITGNYSGSVPLQAIDRNGVVLPAVI
TPFDTIMKVTTKPVAPSSSTSNSSTSSSSETSSSTKATSSKTN 

SP070 nucleotide (SEQ ID NO: 115) 

GCACCAGATGGGGCACAAGGTTCAGGGATCAGATGTTGAAAAGTACTACTTTACCCAACGCGGTCTTGA 
GCAGGCAGGAATTACCATTCTTCCTTTTGATGAAAAAAATCTAGACGGTGATATGGAAATTATCGCTGG
AAATGCCTTTCGTCCAGATAACAACGTCGAAATTGCCTATGCGGACCAAAATGGTATCAGCTACAAACG 
TTACCATGAGTTTCTAGGTAGCTTTATGCGTGACTTTGTTAGCATGGGAGTAGCAGGAGCACATGGAAA 
AACTTCAACGACAGGTATGTTGTCTCATGTCTTGTCTCACATTACAGATACCAGCTTCTTGATTGGAGA 
TGGGACAGGTCGTGGTTCGGCCAATGCCAAATATTTTGTCTTTGAATCTGACGAATATGAGCGTCACTT 
CATGCCTTACCACCCAGAATACTCTATTATCACCAACATTGACTTTGACCATCCAGATTATTTCACAAG
TCTCGAGGATGTTTTTAATGCCTTTAACGACTATGCCAAACAAATCACCAAGGGTCTTTTTGTCTATGG 
TGAAGATGCTGAATTGCGTAAGATTACGTCTGATGCACCAATTTATTATTATGGTTTTGAAGCTGAAGG 
CAATGACTTTGTAGCTAGTGATCTTCTTCGTTCAATAACTGGTTCAACCTTCACCGTTCATTTCCGTGG
ACAAAACTTGGGGCAATTCCACATTCCAACCTTTGGTCGTCACAATATCATGAATGCGACAGCCGTTAT
TGGTCTTCTTTACACAGCAGGATTTGATTTGAACTTGGTGCGTGAGCACTTGAAAACATTTGCCGGTGT
TAAACGTCGTTTCACTGAGAAAATTGTCAATGATACAGTGATTATCGATGACTTTGCCCACCATCCAAC 
AGAAATTATTGCGACCTTGGATGCGGCTCGTCAGAAATACCCAAGCAAGGAAATTGTAGCAGTCTTTCA 
ACCGCATACCTTTACAAGAACCATTGCCTTGTTGGACGACTTTGCCCATGCTTTAAACCAAGCAGATGC 
TGTTTATCTAGCGCAAATTTATGGCTCGGCTCGTGAAGTAGATCATGGTGACGTTAAGGTAGAAGACCT 
AGCCAACAAAATCAACAAAAAACACCAAGTGATTACTGTTGAAAATGTTTCTCCACTCCTAGACCATGA 
CAATGCTGTTTACGTCTTTATGGGAGCAGGAGACATCCAAACCTATGAATACTCATTTGAGCGTCTCTT 
GTCTAACTTGACAAGCAATGTTCAA

SP070 amino acid (SEQ ID NO: 116)

HQMGHKVQGSDVEKYYFTQRGLEQAGITILPFDEKNLDGDMEIIAGNAFRPDNNVEIAYADQNGISYKR
YHEFLGSFMRDFVSMGVAGAHGKTSTTGMLSHVLSHITDTSFLIGDGTGRGSANAKYFVFESDEYERHF
MPYHPEYSIITNIDFDHPDYFTSLEDVFNAFNDYAKQITKGLFVYGEDAELRKITSDAPIYYYGFEAEG
NDFVASDLLRSITGSTFTVHFRGQNLGQFHIPTFGRHNIMNATAVIGLLYTAGFDLNLVREHLKTFAGV
KRRFTEKIVNDTVIIDDFAHHPTEIIATLDAARQKYPSKEIVAVFQPHTFTRTIALLDDFAHALNQADA
VYLAQIYGSAREVDHGDVKVEDLANKINKKHQVITVENVSPLLDHDNAVYVFMGAGDIQTYEYSFERLL
SNLTSNVQ

SP071 nucleotide (SEQ ID NO: 117)

TTTTAACCCAACTGTTGGTACTTTCCTTTTTACTGCAGGATTGAGCTTGTTAGTTTTATTGGTTTCTAA 

AAGGGAAAATGGAAAGAAACGACTTGTTCATTTTCTGCTGTTGACTAGCATGGGAGTTCAATTGTTGCC

GGCCAGTGCTTTTGGGTTGACCAGCCAGATTTTATCTGCCTATAATAGTCAGCTTTCTATCGGAGTCGG 

GGAACATTTACCAGAGCCTCTGAAAATCGAAGGTTATCAATATATTGGTTATATCAAAACTAAGAAACA 

GGATAATACAGAGCTTTCAAGGACAGTTGATGGGAAATACTCTGCTCAAAGAGATAGTCAACCAAACTC

TACAAAAACATCAGATGTAGTTCATTCAGCTGATTTAGAATGGAACCAAGGACAGGGGAAGGTTAGTTT 

ACAAGGTGAAGCATCAGGGGATGATGGACTTTCAGAAAAATCTTCTATAGCAGCAGACAATCTATCTTC

TAATGATTCATTCGCAAGTCAAGTTGAGCAGAATCCGGATCACAAAGGAGAATCTGTAGTTCGACCAAC 

AGTGCCAGAACAAGGAAATCCTGTGTCTGCTACAACGGTGCAGAGTGCGGAAGAGGAAGTATTGGCGAC

GACAAATGATCGACCAGAGTATAAACTTCCATTGGAAACCAAAGGCACGCAAGAACCCGGTCATGAGGG 

TGAAGCCGCAGTCCGTGAAGACTTACCAGTCTACACTAAGCCACTAGAAACCAAAGGTACACAAGGACC 

CGGACATGAAGGTGAAGCTGCAGTTCGCGAGGAAGAACCAGCTTACACAGAACCGTTAGCAACGAAAGG 

CACGCAAGAGCCAGGTCATGAGGGCAAAGCTACAGTCCGCGAAGAGACTCTAGAGTACACGGAACCGGT 

AGCGACAAAAGGCACACAAGAACCCGAACATGAGGGCGAaCGGsCAGTAGAAGAAGAACTTCCGGCTTT 

AGAGGTCACTACACGAAATAGAACGGAAATCCAGAATATTCCTTATACAACAGAAGAAATTCAGGATCC 

AACACTTCTGAAAAATCGTCGTAAGATTGAACGACAAGGGCAAGCAGGGACACGTACAATTCAATATGA 

AGACTACATCGTAAATGGTAATGTCGTAGAAACTAAAGAAGTGTCACGAACTGAAGTAGCTCCGGTCAA

CGAAGTCGTTAAAGTAGGAACACTTGTGAAAGTTAAACCTACAGTAGAAATTACAAACTTAACAAAAGT 

TGAGAACAAAAAATCTATAACTGTAAGTTATAACTTAATAGACACTACCTCAGCATATGTTTCTGCAAA

AACGCAAGTTTTCCATGGAGACAAGCTAGTTAAAGAGGTGGATATAGAAAATCCTGCCAAAGAGCAAGT 

AATATCAGGTTTAGATTACTACACACCGTATACAGTTAAAACACACCTAACTTATAATTTGGGTGAAAA 

TAATGAGGAAAATACTGAAACATCAACTCAAGATTTCCAATTAGAGTATAAGAAAATAGAGATTAAAGA

TATTGATTCAGTAGAATTATACGGTAAAGAAAATGATCGTTATCGTAGATATTTAAGTCTAAGTGAAGC

GCCGACTGATACGGCTAAATACTTTGTAAAAGTGAAATCAGATCGCTTCAAAGAAATGTACCTACCTGT 

AAAATCTATTACAGAAAATACGGATGGAACGTATAAAGTGACGGTAGCCGTTGATCAACTTGTCGAAGA 

AGGTACAGACGGTTACAAAGATGATTACACATTTACTGTAGCTAAATCTAAAGCAGAGCAACCAGGAGT 

TTACACATCCTTTAAACAGCTGGTAACAGCCATGCAAAGCAATCTGTCTGGTGTCTATACATTGGCTTC 

AGATATGACCGCAGATGAGGTGAGCTTAGGCGATAAGCAGACAAGTTATCTCACAGGTGCATTTACAGG

GAGCTTGATCGGTTCTGATGGAACAAAATCGTATGCCATTTATGATTTGAAGAAACCATTATTTGATAC 

ATTAAATGGTGCTACAGTTAGAGATTTGGATATTAAAACTGTTTCTGCTGATAGTAAAGAAAATGTCGC 

AGCGCTGGCGAAGGCAGCGAATAGCGCGAATATTAATAATGTTGCAGTAGAAGGAAAAATCTCAGGTGC 

GAAATCTGTTGCGGGATTAGTAGCGAGCGCAACAAATACAGTGATAGAAAACAGCTCGTTTACAGGGAA 

ACTTATCGCAAATCACCAGGACAGTAATAAAAATGATACTGGAGGAATAGTAGGTAATATAACAGGAAA

TAGTTCGAGAGTTAATAAAGTTAGGGTAGATGCCTTAATCTCTACTAATGCACGCAATAATAACCAAAC

AGCTGGAGGGATAGTAGGTAGATTAGAAAATGGTGCATTGATATCTAATTCGGTTGCTACTGGAGAAAT

ACGAAATGGTCAAGGATATTCTAGAGTCGGAGGAATAGTAGGATCTACGTGGCAAAACGGTCGAGTAAA 

TAATGTTGTGAGTAACGTAGATGTTGGAGATGGTTATGTTATCACCGGTGATCAATACGCAGCAGCAGA 

TGTGAAAAATGCAAGTACATCAGTTGATAATAGAAAAGCAGACAGATTCGCTACAAAATTATCAAAAGA

CCAAATAGACGCGAAAGTTGCTGATTATGGAATCACAGTAACTCTTGATGATACTGGGCAAGATTTAAA

ACGTAATCTAAGAGAAGTTGATTATACAAGACTAAATAAAGCAGAAGCTGAAAGAAAAGTAGCTTATAG 

CAACATAGAAAAACTGATGCCATTCTACAATAAAGACCTAGTAGTTCACTATGGTAACAAAGTAGCGAC

AACAGATAAACTTTACACTACAGAATTGTTAGATGTTGTGCCGATGAAAGATGATGAAGTAGTAACGGA 

TATTAATAATAAGAAAAATTCAATAAATAAAGTTATGTTACATTTCAAAGATAATACAGTAGAATACCT 

AGATGTAACATTCAAAGAAAACTTCATAAACAGTCAAGTAATCGAATACAATGTTACAGGAAAAGAATA

TATATTCACACCAGAAGCATTTGTTTCAGACTATACAGCGATAACGAATAACGTACTAAGCGACTTGCA 

AAATGTAACACTTAAC

SP071 amino acid (SEQ ID NO: 118)

FNPTVGTFLFTAGLSLLVLLVSKRENGKKRLVHFLLLTSMGVQLLPASAFGLTSQILSAYNSQLSIGVG
EHLPEPLKIEGYQYIGYIKTKKQDNTELSRTVDGKYSAQRDSQPNSTKTSDVVHSADLEWNQGQGKVSL
QGEASGDDGLSEKSSIAADNLSSNDSFASQVEQNPDHKGESVVRPTVPEQGNPVSATTVQSAEEEVLAT
TNDRPEYKLPLETKGTQEPGHEGEAAVREDLPVYTKPLETKGTQGPGHEGEAAVREEEPAYTEPLATKG
TQEPGHEGKATVREETLEYTEPVATKGTQEPEHEGERXVEEELPALEVTTRNRTEIQNIPYTTEEIQDP
TLLKNRRKIERQGQAGTRTIQYEDYIVNGNVVETKEVSRTEVAPVNEVVKVGTLVKVKPTVEITNLTKV
ENKKSITVSYNLIDTTSAYVSAKTQVFHGDKLVKEVDIENPAKEQVISGLDYYTPYTVKTHLTYNLGEN
NEENTETSTQDFQLEYKKIEIKDIDSVELYGKENDRYRRYLSLSEAPTDTAKYFVKVKSDRFKEMYLPV
KSITENTDGTYKVTVAVDQLVEEGTDGYKDDYTFTVAKSKAEQPGVYTSFKQLVTAMQSNLSGVYTLAS
DMTADEVSLGDKQTSYLTGAFTGSLIGSDGTKSYAIYDLKKPLFDTLNGATVRDLDIKTVSADSKENVA
ALAKAANSANINNVAVEGKISGAKSVAGLVASATNTVIENSSFTGKLIANHQDSNKNDTGGIVGNITGN
SSRVNKVRVDALISTNARNNNQTAGGIVGRLENGALISNSVATGEIRNGQGYSRVGGIVGSTWQNGRVN
NVVSNVDVGDGYVITGDQYAAADVKNASTSVDNRKADRFATKLSKDQIDAKVADYGITVTLDDTGQDLK
RNLREVDYTRLNKAEAERKVAYSNIEKLMPFYNKDLVVHYGNKVATTDKLYTTELLDVVPMKDDEVVTD
INNKKNSINKVMLHFKDNTVEYLDVTFKENFINSQVIEYNVTGKEYIFTPEAFVSDYTAITNNVLSDLQ
NVTLN

SP072 nucleotide (SEQ ID NO: 119)

TTTTAACCCAACTGTTGGTACTTTCCTTTTTACTGCAGGATTGAGCTTGTTAGTTTTATTGGTTTCTAA
AAGGGAAAATGGAAAGAAACGACTTGTTCATTTTCTGCTGTTGACTAGCATGGGAGTTCAATTGTTGCC
GGCCAGTGCTTTTGGGTTGACCAGCCAGATTTTATCTGCCTATAATAGTCAGCTTTCTATCGGAGTCGG 
GGAACATTTACCAGAGCCTCTGAAAATCGAAGGTTATCAATATATTGGTTATATCAAAACTAAGAAACA
GGATAATACAGAGCTTTCAAGGACAGTTGATGGGAAATACTCTGCTCAAAGAGATAGTCAACCAAACTC 
TACAAAAACATCAGATGTAGTTCATTCAGCTGATTTAGAATGGAACCAAGGACAGGGGAAGGTTAGTTT
ACAAGGTGAAGCATCAGGGGATGATGGACTTTCAGAAAAATCTTCTATAGCAGCAGACAATCTATCTTC 
TAATGATTCATTCGCAAGTCAAGTTGAGCAGAATCCGGATCACAAAGGAGAATCTGTAGTTCGACCAAC 
AGTGCCAGAACAAGGAAATCCTGTGTCTGCTACAACGGTGCAGAGTGCGGAAGAGGAAGTATTGGCGAC 
GACAAATGATCGACCAGAGTATAAACTTCCATTGGAAACCAAAGGCACGCAAGAACCCGGTCATGAGGG 
TGAAGCCGCAGTCCGTGAAGACTTACCAGTCTACACTAAGCCACTAGAAACCAAAGGTACACAAGGACC
CGGACATGAAGGTGAAGCTGCAGTTCGCGAGGAAGAACCAGCTTACACAGAACCGTTAGCAACGAAAGG
CACGCAAGAGCCAGGTCATGAGGGCAAAGCTACAGTCCGCGAAGAGACTCTAGAGTACACGGAACCGGT 
AGCGACAAAAGGCACACAAGAACCCGAACATGAGGGCGAaCGGsCAGTAGAAGAAGAACTTCCGGCTTT
AGAGGTCACTACACGAAATAGAACGGAAATCCAGAATATTCCTTATACAACAGAAGAAATTCAGGATCC 
AACACTTCTGAAAAATCGTCGTAAGATTGAACGACAAGGGCAAGCAGGGACACGTACAATTCAATATGA 
AGACTACATCGTAAATGGTAATGTCGTAGAAACTAAAGAAGTGTCACGAACTGAAGTAGCTCCGGTCAA 
CGAAGTCGTTAAAGTAGGAACACTTGTGAAAGTTAAACCTACAGTAGAAATTACAAACTTAACAAAAGT 
TGAGAACAAAAAATCTATAACTGTAAGTTATAACTTAATAGACACTACCTCAGCATATGTTTCTGCAAA
AACGCAAGTTTTCCATGGAGACAAGCTAGTTAAAGAGGTGGATATAGAAAATCCTGCCAAAGAGCAAGT 
AATATCAGGTTTAGATTACTACACACCGTATACAGTTAAAACACACCTAACTTATAATTTGGGTGAAAA 
TAATGAGGAAAATACTGAAACATCAACTCAAGATTTCCAATTAGAGTATAAGAAAATAGAGATTAAAGA 
TATTGATTCAGTAGAATTATACGGTAAAGAAAATGATCGTTATCGTAGA 

SP072 amino acid (SEQ ID NO: 120) 

FNPTVGTFLFTAGLSLLVLLVSKRENGKKRLVHFLLLTSMGVQLLPASAFGLTSQILSAYNSQLSIGVG
EHLPEPLKIEGYQYIGYIKTKKQDNTELSRTVDGKYSAQRDSQPNSTKTSDVVHSADLEWNQGQGKVSL
QGEASGDDGLSEKSSIAADNLSSNDSFASQVEQNPDHKGESVVRPTVPEQGNPVSATTVQSAEEEVLAT
TNDRPEYKLPLETKGTQEPGHEGEAAVREDLPVYTKPLETKGTQGPGHEGEAAVREEEPAYTEPLATKG
TQEPGHEGKATVREETLEYTEPVATKGTQEPEHEGERXVEEELPALEVTTRNRTEIQNIPYTTEEIQDP
TLLKNRRKIERQGQAGTRTIQYEDYIVNGNVVETKEVSRTEVAPVNEVVKVGTLVKVKPTVEITNLTKV
ENKKSITVSYNLIDTTSAYVSAKTQVFHGDKLVKEVDIENPAKEQVISGLDYYTPYTVKTHLTYNLGEN
NEENTETSTQDFQLEYKKIEIKDIDSVELYGKENDRYRR

SP073 nucleotide (SEQ ID NO:121) 

TCGTAGATATTTAAGTCTAAGTGAAGCGCCGACTGATACGGCTAAATACTTTGTAAAAGTGAAATCAGA 
TCGCTTCAAAGAAATGTACCTACCTGTAAAATCTATTACAGAAAATACGGATGGAACGTATAAAGTGAC 
GGTAGCCGTTGATCAACTTGTCGAAGAAGGTACAGACGGTTACAAAGATGATTACACATTTACTGTAGC
TAAATCTAAAGCAGAGCAACCAGGAGTTTACACATCCTTTAAACAGCTGGTAACAGCCATGCAAAGCAA 
TCTGTCTGGTGTCTATACATTGGCTTCAGATATGACCGCAGATGAGGTGAGCTTAGGCGATAAGCAGAC 
AAGTTATCTCACAGGTGCATTTACAGGGAGCTTGATCGGTTCTGATGGAACAAAATCGTATGCCATTTA
TGATTTGAAGAAACCATTATTTGATACATTAAATGGTGCTACAGTTAGAGATTTGGATATTAAAACTGT 
TTCTGCTGATAGTAAAGAAAATGTCGCAGCGCTGGCGAAGGCAGCGAATAGCGCGAATATTAATAATGT 
TGCAGTAGAAGGAAAAATCTCAGGTGCGAAATCTGTTGCGGGATTAGTAGCGAGCGCAACAAATACAGT 
GATAGAAAACAGCTCGTTTACAGGGAAACTTATCGCAAATCACCAGGACAGTAATAAAAATGATACTGG 
AGGAATAGTAGGTAATATAACAGGAAATAGTTCGAGAGTTAATAAAGTTAGGGTAGATGCCTTAATCTC 
TACTAATGCACGCAATAATAACCAAACAGCTGGAGGGATAGTAGGTAGATTAGAAAATGGTGCATTGAT 
ATCTAATTCGGTTGCTACTGGAGAAATACGAAATGGTCAAGGATATTCTAGAGTCGGAGGAATAGTAGG 
ATCTACGTGGCAAAACGGTCGAGTAAATAATGTTGTGAGTAACGTAGATGTTGGAGATGGTTATGTTAT 
CACCGGTGATCAATACGCAGCAGCAGATGTGAAAAATGCAAGTACATCAGTTGATAATAGAAAAGCAGA 
CAGATTCGCTACAAAATTATCAAAAGACCAAATAGACGCGAAAGTTGCTGATTATGGAATCACAGTAAC 
TCTTGATGATACTGGGCAAGATTTAAAACGTAATCTAAGAGAAGTTGATTATACAAGACTAAATAAAGC 
AGAAGCTGAAAGAAAAGTAGCTTATAGCAACATAGAAAAACTGATGCCATTCTACAATAAAGACCTAGT 
AGTTCACTATGGTAACAAAGTAGCGACAACAGATAAACTTTACACTACAGAATTGTTAGATGTTGTGCC 
GATGAAAGATGATGAAGTAGTAACGGATATTAATAATAAGAAAAATTCAATAAATAAAGTTATGTTACA 
TTTCAAAGATAATACAGTAGAATACCTAGATGTAACATTCAAAGAAAACTTCATAAACAGTCAAGTAAT 
CGAATACAATGTTACAGGAAAAGAATATATATTCACACCAGAAGCATTTGTTTCAGACTATACAGCGAT 
AACGAATAACGTAGTAAGCGACTTGCAAAATGTAACACTTAAC

SP073 amino acid (SEQ ID NO: 122) 

RRYLSLSEAPTDTAKYFVKWSDRFKEMYLPVKSITENTDGTY^^ 

KSKAEQPGVYTSFKQLVTAMQSNLSGVYTLASDMTADEVSLGDKQTSYLTGAFTGSLIGSDGTKSYAIY 
DLKKPLFDTLNGATVFJ)LDIKTVSADSKENVAAI^^^ 

IENSSFTGKLIANHQDSNKNDTGGIVGNITGNSSRVNKVKVDALISTNARNNNQTAGGIVGRLENGALI
SNSVATGEIRNGQGYSRVGGIVGSTWQNGRVNNWSNVDVGDGYVITGDQYAAADVKNASTSVDNRKAD
RFATKLSKDQIDAKVADYGITVTLDDTGQDLKRNLREWYTRLNKAEAERKVAYSNIEKLMPFYNKDLV 
VHYGNKVATTDKLYTTELLDWPMKDDEVOTDINNKKN^ 
EYNVTGKEYIFTPEAFVSDYTAITNNVLSDLQNVTLN 

SP074 nucleotide (SEQ ID NO: 123) 

CTTTGGTTTTGAAGGAAGTAAGCGTGGACAATTTGCTGTAGAAGGAATCAATCAACTTCGTGAGCATGT 
AGACACTCTATTGATTATCTCAAACAACAATTTGCTTGAAATTGTTGATAAGAAAACACCGCTTTTGGA 
GGCTCTTAGCGAAGCGGATAACGTTCTTCGTCAAGGTGTTCAAGGGATTACCGATTTGATTACCAATCC 
AGGATTGATTAACCTTGACTTTGCCGATGTGAAAACGGTAATGGCAAACAAAGGGAATGCTCTTATGGG 
TATTGGTATCGGTAGTGGAGAAGAACGTGTGGTAGAAGCGGCACGTAAGGCAATCTATTCACCACTTCT 
TGAAACAACTATTGACGGTGCTGAGGATGTTATCGTCAACGTTACTGGTGGTCTTGACTTAACCTTGAT 
TGAGGCAGAAGAGGCTTCACAAATTGTGAACCAGGCAGCAGGTCAAGGAGTGAACATCTGGCTCGGTAC 
TTCAATTGATGAAAGTATGCGTGATGAAATTCGTGTAACAGTTGTTGCAACGGGTGTTCGTCAAGACCG 
CGTAGAAAAGGTTGTGGCTCCACAAGCTAGATCTGCTACTAACTACCGTGAGACAGTGAAACCAGCTCA 
TTCACATGGCTTTGATCGTCATTTTGATATGGCAGAAACAGTTGAATTGCCAAAACAAAATCCACGTCG 
TTTGGAACCAACTCAGGCATCTGCTTTTGGTGATTGGGATCTTCGCCGTGAATCGATTGTTCGTACAAC 
AGATTCAGTCGTTTCTCCAGTCGAGCGCTTTGAAGCCCCAATTTCACAAGATGAAGATGAATTGGATAC 

ACCTCCATTTTTCAAAAATCGT 

SP074 amino acid (SEQ ID NO: 124) 

FGFEGSK^GQFAVEGINQLREKVDTLLIISNNNLLEIVDKKTPLLEALSEADNVLRQGVQGITDLITNP 
GLINLDFADVXTVMANKGNALMGIGIGSGEERWE1AARKAIYSPLLETTIDGAEDVIVNVTGGLDLTLI 
EAEEASQIVNQAAGQGVNIWLGTSIDESMRDEIRVTWATGVRQDRVEKWAPQARSATNYRETVKPAH 
SHGFDRHFDMAETVELPKQNPRRLEPTQASAFGDWDLRRESIVKTTDSWSPVERFEAPISQDEDELDT 

PPFFKNR 

SP075 nucleotide (SEQ ID NO: 125) 

CTACTACCTCTCGAGAGAAAGTGACCTAGAGGTGACCGTTTTTGACCATGAGCAAGGTCAAGCCACCAA 
GGCCGCAGCAGGAATTATCAGTCCTTGGTTTTCCAAACGCCGTAATAAAGCCTGGTACAAGATGGCGCG 
CTTGGGGGCTGATTTTTATGTGGATTTATTAGCTGATTTAGAGAAATCAGGACAAGAAATCGACTTTTA 
CCAGCGTTCGGGAGTCTTTCTCTTGAAAAAGGATGAATCCAATTTGGAAGAACTTTATCAACTGGCCCT 
CCAGCGCAGAGAAGAATCTCCCTTGATAGGGCAATTAGCCATTCTGAACCAAGCCTCAGCTAATGAATT 
ATTCCCTGGTTTGCAGGGATTTGACCGCCTGCTCTATGCTTCTGGTGGAGCGAGAGTAGATGGCCAACT
TTTAGTGACTCGTTTGCTGGAAGTCAGTCATGTCAAGCTGGTCAAAGAAAAAGTGACTCTGACACCGTT

AGCATCAGGCTACCAGATTGGTGAAGAGGAGTTTGAGCAGGTTATTTTGGCGACGGGAGCTTGGTTGGG 

GGACATGTTAGAGCCTTTAGGTTATGAAGTGGATGTCCGTCCTCAAAAAGGACAACTACGAGATTATCA 

GCTTGCCCAAGACATGGAAGATTACCCTGTTGTCATGCCAGAAGGGGAGTGGGATTTGAT^ 

AGGTGGGAAATTATCCTTAGGCGCTACCCACGAAAATGACATGGGATTTGATTT^ 

CTTGCTCCAACAAATGGAGGAGGCCACCTTGACTCACTATCTGATTTTGGCTGAAGC 

TGAGCGTGTTGGAATCCGTGCCTACACCAGTGATTTCTCTCCTTTCTTTGGGCAGGTGCCTGACTTAAC 

TGGTGTCTATGCAGCCAGTGGACTAGGTTCATCAGGCCTCACAACTGGTCCTATCATTGGTTACCATCT 

AGCCCAACTGATCCAAGACAAGGAGTTGACCTTGGACCCTCTAAATTACCCAATTGAAAACTATGTCAA 

ACGAGTAAAAAGCGAA 

SP075 amino acid (SEQ ID NO: 126)

YYLSRESDLEVTVFDHEQGQATKAAAGIISPWFSKRRNKAWYKMARLGADFYVDLLADLEKSGQEIDFY
QRSGWLLKKDESNLEELYQLALQRREESPLIGQI^IL^OASAWELFPGLQGFDRLLYASGGARVDGQL 
LWRLLEVSHVKLVKEKVTLTPLASGYOIGEEEFEQVILATGAWLGDMLEPLGYEVDVRPQKGQLRDYQ 
UVQDMEDYPVVMPEGEWDLIPFAGGXLSLGATHENDMGFDLTVDETLLQQMEEATLTHYLILAEATSKS 
ERVGIRAYTSDFSPFFGQVPDLTGVYAASGLGSSGLTTGPIIGYHLAQLIQDKELTLDPL^rYPIE^^!A/K 
RVKSE 

SP076 nucleotide (SEQ ID NO: 127) 

TAAGGTCAAAAGTCAGACCGCTAAGAAAGTGCTAGAAAAGATTGGAGCTGACTCGGTTATCTCGCCAGA 
GTATGAAATGGGGCAGTCTCTAGCACAGACCATTCTTTTCCATAATAGTGTTGATGTCTTTCAGTTGGA 
TAAAAATGTGTCTATCGTGGAGATGAAAATTCCTCAGTCTTGGGCAGGTCAAAGTCTGAGTAAATTAGA 
CCTCCGTGGCAAATACAATCTGAATATTTTGGGTTTCCGAGAGCAGGAAAATTCCCCATTGGATGTTGA 
ATTTGGACCAGATGACCTCTTGAAAGCAGATACCTATATTTTGGCAGTCATCAACAACCAGTATTTGGA
TACCCTA 

SP076 amino acid (SEQ ID NO: 128)

KVKSQTAKKVLEKIGADSVISPEYEMGQSLAQTILFHNSVDVFQLDKNVSIVEMKIPQSWAGQSLSKLD 
LRGKYNLNILGFREQENSPLDVEFGPDDLLKADTYILAVINNQYLDTL 

SP077 nucleotide (SEQ ID NO: 129)

TGACGGGTCTCAGGATCAGACTCAGGAAATCGCTGAGTGTTTAGCTAGCAAGTATCCTAATATCGTTAG 
AGCCATCTATCAGGAAAATAAATGCCATGGCGGTGCGGTCAATCGTGGCTTGGTAGAGGCTTCTGGGCG 
CTATTTTAAAGTAGTTGACAGTGATGACTGGGTGGATCCTCGTGCCTACTTGAAAATTCTTGAAACTTG 
CAGGAACTTGAGAGCAAAGGTCAAGAGGTGGATGTCTTTG

SP077 amino acid (SEQ ID NO: 130) 

DGSQDQTQEIAECLASKYPNIVRAIYQENKCHGGAVNRGLVEASGRYFKWDSDDWVDPRAYLKILETC 
RNLRAKVKRWMSL 

SP078 nucleotide (SEQ ID NO:131) 

TAGAGGCTTTGCCAAATGGTGGGAAGGGCACGAGCGTCGAAAAGAGGAACGCTTTGTCAAACAAGAAGA

AAAAGCTCGCCAAAAGGCTGAGAAAGAGGCTAGATTAGAACAAGAAGAGACTGAAAAAGCCTTACTCGA

TTTGCCTCCTGTTGATATGGAAACGGGTGAAATTCTGACAGAGGAAGCTGTTCAAAATCTTCCACCTAT 

TCCAGAAGAAAAGTGGGTGGAACCAGAAATCATCCTGCCTCAAGCTGAACTTAAATTCCCTGAACAGGA 

AGATGACTCAGATGACGAAGATGTTCAGGTCGATTTTTCAGCCAAAGAAGCCCTTGAATACAAACTTCC

AAGCTTACAACTCTTTGCACCAGATAAACCAAAAGATCAGTCTAAAGAGAAGAAAATTGTCAGAGAAAA 

TATCAAAATCTTAGAAGCAACCTTTGCTAGCTTTGGTATTAAGGTAACAGTTGAACGGGCCGAAATTGG 

GCCATCAGTGACCAAGTATGAAGTCAAGCCGGCTGTTGGTGTAAGGGTCAACCGCATTTCCAATCTATC 

AGATGACCTCGCTCTAGCCTTGGCTGCCAAAGATGTCCGGATTGAAGCACCAATCCCTGGGAAATCCCT 

AATCGGAATTGAAGTGCCCAACTCCGATATTGCCACTGTATCTTTCCGAGAACTATGGGAACAATCGCA 

AACGAAAGCAGAAAATTTCTTGGAAATTCCTTTAGGGAAGGCTGTTAATGGAACCGCAAG 

CCTTTCTAAAATGCCCCACTTGCTAGTTGCAGGTTCAACGGGTTCAGGGAAGTCAGTAGCAGTTAACGG

CATTATTGCTAGCATTCTCATGAAGGCGAGACCAGATCAAGTTAAATTTATGATGGTCGATCCCAAGAT 

GGTTGAGTTATCTGTTTACAATGATATTCCCCACCTCTTGATTCCAGTCGTGACCAATCCACGCAAAGC 

CAGCAAGGCTCTGCAAAAGGTTGTGGATGAAATGGAAAACCGTTATGAACTCTTTGCCAAGGTGGGAGT

TCGGAATATTGCAGGTTTTAATGCCAAGGTAGAAGAGTTCAATTCCCAGTCTGAGTACAAGCAAATTCC 
GCTACCATTCATTGTCGTGATTGTGGATGAGTTGGCTGACCTCATGATGGTGGCCAGCAAGGAAGTGGA
AGATGCTATCATCCGTCTTGGGCAGAAGGCGCGTGCTGCAGGTATCCACATGATTCTTGCAACTCAGCG 
TCCATCTGTTGATGTCATCTCTGGTTTGATTAAGGCCAATGTTCCATCTCGTGTAGCATTTGCGGTTTC 
ATCAGGAACAGACTCCCGTACGATTTTGGATGAAAATGGAGCAGAAAAACTTCTTGGTCGAGGAGACAT 
GCTCTTTAAACCGATTGATGAAAATCATCCAGTTCGTCTCCAAGGCTCCTTTATCTCGGATGACGATGT 
TGAGCGCATTGTGAACTTCATCAAGACTCAGGCAGATGCAGACTACGATGAGAGTTTTGATCCAGGTGA 
GGTTTCTGAAAATGAAGGAGAATTTTCGGATGGAGATGCTGGTGGTGATCCGCTTTTTGAAGAAGCTAA 
GTCTTTGGTTATCGAAACACAGAAAGCCAGTGCGTCTATGATTCAGCGTCGTTTATCAGTTGGATTTAA 
CCGTGCGACCCGTCTCATGGAAGAACTGGAGATAGCAGGTGTCATCGGTCCAGCTGAAGGTACCAAACC 
TCGAAAAGTGTTACAACAA 

SP078 amino acid (SEQ ID NO: 132) 

RGFAKWVTCGHERRKEERFVKQEEKARQKAEKEARLEQEE 

PEEKWVEPEIILPQAELKFPEQEDDSDDEDVQVDFSAKEALEYKLPSLQLFAPDKPKDQSKEKKIVREN 
IKILEATFASFGIKVTVERAEIGPSVTK.YEVKPAVGVRVNRISNLSDD^ 

IGIEVPNSDIATVSFRELWEQSQTKAENFLEIPLGKAVNGTAPAFDLSKMPHLLVAGSTGSGXSVAVNG 
IIASILMKARPDQVKFMMVDPKMVELSVYNDIPHLLIPWTNPRXASKALQKWDEMENKYELFAKVGV
RNIAGFNAKVEEFNSQSEYKQIPLPFIWIVDELADLMMVASKEVEDAIIRLGQKARAAGIHMILATQR
PSVDVISGLIKANVPSRVAFAVSSGTDSRTILDENGAEKLLGRGDMLFKPIDENHPVRLQGSFISDDDV
ERIVNFIKTQADADYDESFDPGEVSENEGEFSDGDAGGDPLFEEAKSLVIETQKASASMIQRRLSVGFN
RATRLMEELEIAGVIGPAEGTKPRKVLQQ 

SP079 nucleotide (SEQ ID NO: 133)

TCAAAAAGAGAAGGAAAACTTGGTTATTGCTGGGAAAATAGGTCCAGAACCAGAAATTTTGGCCAATAT
GTATAAGTTGCTGATTGAAGAAAATACCAGCATGACTGCGACTGTTAAACCGAATTTTGGGAAGACAAG 
CTTCCTTTATGAAGCTCTGAAAAAAGGCGATATTGACATCTATCCTGAATTTACTGGTACGGTGACTGA 
AAGTTTGCTTCAACCATCACCCAAGGTGAGTCATGAACCAGAACAGGTTTATCAGGTGGCGCGTGATGG 
CATTGCTAAGCAGGATCATCTAGCCTATCTCAAACCCATGTCTTATCAAAACACCTATGCTGTAGCTGT 
TCCGAAAAAGATTGCTCAAGAATATGGCTTGAAGACCATTTCAGACTTGAAAAAAGTGGAAGGGCAGTT
GAAGGCAGGTTTTACACTCGAGTTTAACGACCGTGAAGATGGAAATAAGGGCTTGCAATCAATGTATGG 
TCTCAATCTCAATGTAGCGACCATTGAGCCAGCCCTTCGCTATCAGGCTATTCAGTCAGGGGATATTCA 
AATCACGGATGCCTATTCGACTGATGCGGAATTGGAGCGTTATGATTTACAGGTCTTGGAAGATGACAA 
GCAACTCTTCCCACCTTATCAAGGGGCTCCACTCATGAAAGAAGCTCTTCTCAAGAAACACCCAGAGTT 
GGAAAGAGTTCTTAATACATTGGCTGGTAAGATTACAGAAAGCCAGATGAGCCAGCTCAACTACCAAGT 
CGGTGTTGAAGGCAAGTCAGCAAAGCAAGTAGCCAAGGAGTTTCTCCAAGAACAAGGTTTGTTGAAGAA 

A 



SP079 amino acid (SEQ ID NO: 134) 

QKEKENLVIAGKIGPEPEIIJU^KLLIEENTSMTATVKPNFGKTSFLYEALKKGDIDIYPEFTGTVT 
SLLQPSPKVSHEPEQVYQVARDGIAKQDHLAYLKPMSYQNTYAVAVPKKIAQEYGLKTISDLKKVEGQL 
KAGFTLEFNDREDGNKGLQSMYGLNLNVATIEPALRYQAIQSGDIQITDAYSTDAELERYDLQVLEDDK 
QLFPPYQGAPLMKEALLKKHPELERVLNTLAGKITESQMSQLNYQVGVEGKSAKQVAKEFLQEQGLLKK 

SP080 nucleotide (SEQ ID NO:135) 

ACGTTCTATTGAGGACCACTTTGATTCAAACTTCGAATTGGAATATAACCTCAAAGAAAAAGGGAAAAC
AGATCTTTTGAAGCTAGTTGATAAAACAACTGACATGCGTCTGCATTTTATCCGCCAAACTCATCCACG
CGGTCTCGGAGATGCTGTTTTGCAAGCCAAGGCTTTCGTCGGAAATGAACCTTTTGTCGTTATGCTTGG 
TGATGACTTGATGGATATCACAGACGAAAAGGCTGTTCCACTTACCAAACAACTCATGGATGACTACGA 
GCGTACCCACGCGTCTACTATCGCTGTCATGCCAGTCCCTCATGACGAAGTATCTGCTTACGGGGTTAT 
TGCTCCGCAAGGCGAAGGAAAAGATGGTCTTTACAGTGTTGAAACCTTTGTTGAAAAACCAGCTCCAGA 
GGACGCTCCTAGCGACCTTGCTATTATCGGACGCTACCTCCTCACGCCTGAAATTTTTGAGATTCTCGA 
AAAGCAAGCTCCAGGTGCAGGAAATGAAATTCAGCTGACAGATGCAATCGACACCCTCAATAAAACACA 
ACGTGTATTTGCTCGTGAGTTCAAAGGGGCTCGTTACGATGTCGGAGACAAGTTTGGCTTCATGAAAAC 
ATCCATCGACTACGCCCTCAAACACCCACAAGTCAAAGATGATTTGAAGAATTACCTCATCCAACTTGG 

AAAAGAATTGACTGAGAAGGAA 

SP080 amino acid (SEQ ID NO: 136)

RSIEDHFDSNFELEYNLKEKGKTDLLKLVDKTTDMRLHFIRQTHPRGLGDAVLQAKAFVGNEPFVVMLG 
DDIJ1DITDEKAVPLTKQLOTDYERTHASTIAVMPVPHDEVSAYGVIAPQGEGKIX3LYSVETFVEKPAPE 
DAPSDLAIIGRYLLTPEIFEILEKQAPGAGNEIQLTDAIDTLNKTQRVFAREFKGARYDVGDKFGFMKT 
SIDYALKHPQVKDDLKNYLIQLGKELTEKE 

SP081 nucleotide (SEQ ID NO: 137)

CGCTCAAAATACCAGAGGTGTTCAGCTAATCGAGCACGTTT^ 
GAGTGTCTTTTCTGATATTCCACCTCAGGCTGTAAAAACTGGAATGTTGGCT 

AATCATCCAACCCTATCTTAAAAAACTGGATTGTCCCTATGTCCTTGATCCTGTTATGGTTGCTACAAG 

TGGAGATGCCTTGATTGACTCAAATGCTAGAGACTATCTCAAAACAAACTTACTACCTCTAGCAACTAT 

TATTACGCCAAATCTTCCTGAAGCAGAAGAGATTGTTGGTTTTTCAATCCATGACCCCGAAGACATGCA 

GCGTGCTGGTCGCCTGATTTTAAAAGAATTTGGTCCTCAGTCTGTGGTTATCAAAGGCGGACATCTCAA

AGGTGGTGCTAAAGATTTCCTCTTTACCAAGAATGAACAATTTGTCTGC^ 

CTGTCACACCCATGGTACT 

SP081 amino acid (SEQ ID NO: 138)

AQ^RGVQLIEHVSPQMLKAQLESWSDIPPQAVKTGMUVTTEIMEIIQPYLKKLDCPr/LDPVMVATS 
GDALIDSNARDYLKTNLLPLATIITPNLPZAEEIVGFSIHDPEDMQRAGRLILKEFGPQSWIKGGHLK 
GGAKDFLFTKNEQFVWESPRIQTCKTHGT 

SP082 nucleotide (SEQ ID NO: 139)

AATTGTACAATTAGAAAAAGATAGCAAATCAGACAAAGAACAAGTTGATAAACTATTTGAATCATTTGA 

TGCATCTTCAGATGAATCTATTTCTAAATTAAAAGAACTATCTGAAACTTCACTTAAAACCGATGCAGG 

TAAAGACTATCTTAATAACAAAGTCAAAGAATCATCTAAAGCAATTGTAGATTTTCATTTGCAAAAAGG 

TTTGGCTTATGATGT^AAAGATTCAGATGACAA^TTTAAAGATAAAGCAACTCTTC 

AGAAATTACAAAACAAATTGATTTTATCAAAAAAGTTGATGAAACTTTTAAACAAGAGAATTTG^ 

AACTCTTAAATCTCTAAATGATCTTGTTGATAAATATCAAAAACAAATCGAACTTTTGAAGAAAGAAGA 

AGAAAAAGCTGCTGAAAAAGCTGCTGAAAAAGCAAAGGAATCTTCTAGTCAAAGTAATTCTTCTGGTAG 

TGCTTCTAATGAGTCTTATAATGGATCTTCCAATTCAAATGTAGATTATAGTTCATCTGAACAAACTAA 

TGGATATTCAAATAATTATGGCGGTCAAGATTATTCTGGTTCAGGAGATAGTTCAACAAATGGTGGATC 

ATCAGAACAATATTCATCTAGCAATTCAAACAGCGGAGCAAATAATGTCTACAGATATAAAGGCACTGG 

TGCTGACGGCTATCAAAGATACTACTACAAAGATCATAATAATGGAGATGTGTATGATGACGATGGAAA 

TTACCTTGGGAACTTTGGTGGCGGCATTGCAGAACCTAGTCAACGC 

SP082 amino acid (SEQ ID NO: 140)

IVQLEKDSKSDKEQVDKLFESFDASSDESISKLKELSETSLKTDAGKDYLNNKVKESSKAIVDFHLQKG 
LAYDVKDSDDKFKDKATLET^A^CEITKQIDFIKKVDETFKQENLEETLKSL^^DLVDKYQKQIELLKKEE 
EKAAEKAAEKAKESSSQSNSSGSASNESYNGSSNSNVDYSSSEQTNGYSNNYGGQDYSGSGDSSTNGGS 
SEQYSSSNSNSGANNVYRYKGTGADGYQRYYYKDHNNGDVYDDDGNYLGNFGGGIAEPSQR 

SP083 nucleotide (SEQ ID NO: 141)

TCTGACCAAGCAAAAAGAAGCAGTCAATGACAAAGGAAAAGCAGCTGTTGTTAAGGTGGTGGAAAGCCA 
GGCAGAACTTTATAGCTTAGAAAAGAATGAAGATGCTAGCCTAAGAAAGTTACAAGCAGATGGACGCAT
CACGGAAGAACAGGCTAAAGCTTATAAAGAATACAATGATAAAAATGGAGGAGCAAATCGTAAAGTCAA 
TGAT 

SP083 amino acid (SEQ ID NO: 142)

LTKQKEAVNDKGKAAVVKWESQAELYSLEKNEDASLRXLQADGRITEEQAKAYKEYNDKNGGA2TOKW 
D 

SP084 nucleotide (SEQ ID NO: 143)

GTCCGGCTCTGTCCAGTCCACTTTTTCAGCGGTAGAGGAACAGATTTTCTTTATGGAGTTTGAAGAACT 
CTATCGGGAAACCCAAAAACGCAGTGTAGCCAGTCAGCAAAAGACTAGTCTGAACTTAGATGGGCAGAC 
GCTTAGCAATGGCAGTCAAAAGTTGCCAGTCCCTAAAGGAATTCAGGCCCCATCAGGCCAAAGTATTAC 
ATTTGACCGAGCTGGGGGCAATTCGTCCCTGGCTAAGGTTGAATTTCAGACCAGTAAAGGAGCGATTCG
CTATCAATTATATCTAGGAAATGGAAAAATTAAACGCATTAAGGAAACAAAAAAT

SP084 amino acid (SEQ ID NO: 144)

SGSVQSTFSAVEEQIFFMEFEELYRETQKRSVASQQKTSLNLDGQTLSNGSQKLPVPKGIQAPSGQSIT 
FDRAGGNSSLAKVEFQTSKGAIRYQLYLGNGKIKRIKETKN 

SP085 nucleotide (SEQ ID NO: 145) 

GGGACAAATTCAAAAAAATAGGCAAGAGGAAGCAAAAATCTTGCAAAAGGAAGAAGTCTTGAGGGTAG^ 
TAAGATGGCCCTGCAGACGGGGCAAAATCAGGTAAGCATCAACGGAGTTGAGATTCAGGTATTTTCTAG 
TGAAAAAGGATTGGAGGTCTACCATGGTTCAGAACAGTTGTTGGCAATCAAAGAGCCA 

SP085 amino acid (SEQ ID NO:146) 

GQIQKNRQEEAKILQKEEVLRVAKMALQTGQNQVSINGVEIQVFSSEKGLEVYHGSEQLLAIKEP 

SP086 nucleotide (SEQ ID NO: 147)

TCGCTACCAGCAACAAAGCGAGCAAAAGGAGTGGCTCTTGTTTGTGGACCAACTTGAGGTAGAATTAGA 
CCGTTCGCAGTTCGAAAAAGTAGAAGGCAATCGCCTATACATGAAGCAAGATGGCAAGGACATCGCCAT 
CGGTAAGTCAAAGTCAGATGATTTCCGTAAAACGAATGCTCGTGGTCGAGGTTATCAGCCTATGGTTTA
TGGACTCAAATCTGTACGGATTACAGAGGACAATCAACTGGTTCGCTTTCATTTCCAGTTCCAAAAAGG 
CTTAGAAAGGGAGTTCATCTATCGTGTGGAAAAAGAAAAAAGT 

SP086 amino acid (SEQ ID NO:148) 

RYQQQSEQKEWLLFVDQLEVELDRSQFEKVEGNRLYMKQDGKDIAIGKSKSDDFRKTNARGRGYQPMVY 
GLKSVRITEDNQLVRFHFQFQKGLEREFIYRVEKEKS 

SP087 nucleotide (SEQ ID NO:149) 

GAACCGACAAGTCGCCCACTATCAAGACTATGCTTTGAATAAAGAAAAATTGGTTGCTTTTGCTATGGC 
TAAACGAACCAAAGATAAGGTTGAGCAAGAAAGTGGGGAACAGTTTTTTAATCTAGGTCAGGTAAGCTA
TCAAAACAAGAAAACTGGCTTAGTGACGAGGGTTCGTACGGATAAGAGCCAATATGAGTTTCTGTTTCC
TTCAGTCAAAATCAAAGAAGAGAAAAGAGATAAAAAGGAAGAGGTAGCGACCGATTCAAGCGAAAAAGT 
GGAGAAGAAAAAATCAGAAGAGAAGCCTGAAAAGAAAGAGAATTCA 

SP087 amino acid (SEQ ID NO: 150) 

NRQVAHYQDYALNKEKLVAFAMAKRTKDKVEQESGEQFFNLGQVSYQNKKTGLVTRVRTDKSQYEFLFP 
SVKIKEEKRDKKEEVATDSSEKVEKKKSEEKPEKKENS 

SP088 nucleotide (SEQ ID NO: 151) 

GGTTGTCGGCTGGCAATATATCCCGTTTCCATCTAAAGGTAGTACAATTGGTCCTTACCCAAATGGTAT 
CAGATTAGAAGGTTTTCCAAAGTCAGAGTGGTACTACTTCGATAAAAATGGAGTGCTACAAGAGTTTGT
TGGTTGGAAAACATTAGAGATTAAAACTAAAGACAGTGTTGGAAGAAAGTACGGGGAAAAACGTGAAGA 
TTCAGAAGATAAAGAAGAGAAGCGTTATTATACGAACTATTACTTTAATCAAAATCATTCTTTAGAGAC 
AGGTTGGCTTTATGATCAGTCTAACTGGTATTATCTAGCTAAGACGGAAATTAATGGAGAAAACTACCT
TGGTGGTGAAAGACGTGCGGGGTGGATAAACGATGATTCGACTTGGTACTACCTAGATCCAACAACTGG 
TATTATGCAAACAGGTTGGCAATATCTAGGTAATAAGTGGTACTACCTCCGTTCCTCAGGAGCAATGGC
CACTGGCTGGTATCAGGAAGGTACCACTTGGTATTATTTAGACCACCCAAATGGCGATATGAAAACAGG 
TTGGCAAAACCTTGGGAACAAATGGTACTATCTCCGTTCATCAGGAGCTATGGCAACTGGTTGGTATCA
AGATGGTTCAACTTGGTACTACCTAAATGCAGGTAATGGAGACATGAAGACAGGTTGGTTCCAGGTCAA 
TGGCAACTGGTACTATGCTTATAGCTCAGGTGCTTTGGCAGTGAATACGACCGTAGATGGCTATTCTGT
CAACTATAATGGCGAATGGGTTCGG

SP088 amino acid (SEQ ID NO: 152)

WGWQYIPFPSKGSTIGPYPNGIRLEGFPKSEl^TYYFDKNGVLQEFVGWKTLEIKTKDSVGRKYGEKRED 
SEDKEEKRYYT^HTFNQNHSLETGV^YDQSNWYYLAKTEINGEOT 
IM^GWQYLGNKWYYLRSSGAMATGVATQEGTT^^ 
DGSTWYYLNAGNGDMKTGWFQVNGNWYYAYSSGALAVNTTVD^ 

SP089 nucleotide (SEQ ID NO: 153) 

GGCCAAATCAGAATGGGTAGAAGACAAGGGAGCCTTTTATTATCTTGACCAAGATGGAAAGATGAAAAG 
AAATGCTTGGGTAGGAACTTCCTATGTTGGTGCAACAGGTGCCAAAGTAATAGAAGACTGGGTCTATGA 
TTCTCAATACGATGCTTGGTTTTATATCAAAGCAGATGGACAGCACGCAGAGAAAGAATGGCTCCAAAT
TAAAGGGAAGGACTATTATTTCAAATCCGGTGGTTATCTACTGACAAGTCAGTGGATTAATCAAGCTTA

TGTGAATGCTAGTGGTGCCAAAGTACAGCAAGGTTGGCTTTTTGACAAACAATACCAATCTTGGTTOT 

CATCAAAGAAAATGGAAACTATGCTGATAAAGAATGGATTTTCGAGAATGGTCACTATTATTATCTAAA 

ATCCGGTGGCTACATGGCAGCCAATGAATGGATTTGGGATAAGGAATCTTGGTT^ 

TGGGAAAATGGCTGAAAAAGAATGGGTCTACGATTCTCATAGTCAAGCTTGGTACTACTTCAAATCCGG 

TGGTTACATGACAGCCAATGAATGGATTTGGGATAAGGAATCTTGGTTT^ 

AATAGCTGAAAAAGAATGGGTCTACGATTCTCATAGTCAAGCTTGGTACTACTTCAAATC^ 

CATGACAGCCAATGAATGGATTTGGGATAAGGAATCTTGGTTTTACCTCAAATCTC 

TGAAAAAGAATGGGTCTACGATTCTCATAGTCAAGCTTGGTACTACTTCAAATCTGGTGGCTACATGGC 

GAAAAATGAGACAGTAGATGGTTATCAGCTTGGAAGCGATGGTAAATGGCTTG^ 

TGAAAATGCTGCTTACTATCAAGTAGTGCCTGTTACAGCCAATGTTTATGATTCAGATGGTGAAAAGCT 
TTCCTATATATCGCAAGGTAGTGTCGTATGGCTAGATAAGGATAGAAAAAGTGATGACAAGCGCTTGGC 
TATTACTATTTCTGGTTTGTCAGGCTATATGAAAACAGAAGATTTACAAGCGCTAGATGCTAGTAAGGA 
CTTTATCCCTTATTATGAGAGTGATGGCCACCGTTTTTATCACTATGTGGCTC 

AGTAGCTTCTCATCTTTCTGATATGGAAGTAGGCAAGAAATATTATTCGGCAGATGGCCTGCATTTTGA 
TGGTTTTAAGCTTGAGAATCCCTTCCTTTTCAAAGATTTAACAGAGGCTACAAACTACAGTC 
ATTGGATAAGGTATTTAGTTTGCTAAACATTAACAATAGCCTTTTGGAGAACAAGGGCGCTACTTTTAA
GGAAGCCGAAGAACATTACCATATCAATGCTCTTTATCTCCTTGCCCATAGTGCCCTAGAAAGTAACTG 
GGGAAGAAGTAAAATTGCCAAAGATAAGAATAATTTCTTTGGCATTACAGCCTATGATACGACCCCTTA
CCTTTCTGCTAAGACATTTGATGATGTGGATAAGGGAATTTTAGGTGCAACCAAGTGGATTAAGGAAAA 
TTATATCGATAGGGGAAGAACTTTCCTTGGAAACAAGGCTTCTGGTATGAATGTGGAATATGCTTCAGA
CCCTTATTGGGGCGAAAAAATTGCTAGTGTGATGATGAAAATCAATGAGAAG

SP089 amino acid (SEQ ID NO: 154)

AKSEWVEDKGAFYYLDQDGKMKRNAWVGTSYVGATGAKVIEDWVYDSQYDAWFYIKADGQHAEKEWLQI

KGKDYYFKSGGYLLTSQWINQAYVNASGAKVQQGWLFDKQYQSWFYIKENGNYADKEWIFENGHYYYLK 

SGGYMAANEWIVTOKESV^YLKFDGKMAEKEWVYDSHSQAWYYFKSC^ 

IAEKEWVYDSHSQAOTYFKSGGYMTANEWIWDKESWFYLKSDGKIAEK^^ 

KNETVDGYQLGSDGKWLGGKTTNENAAYYQWPVTANVYDSDG^^ 

ITISGLSGYMKTEDLQALDASKDFIPYYESDGHRFYHYVAQNASIPVASHLSDMEVGKKYYSADGLHFD

GFKLENPFLFKDLTEATNYSAEELDKVFSLLNINNSLLENKGATFKEAEEHYHINAiYLIAHSALESNW 

GRSKIAKDKNNFFGITAYimTYLSAKTFDDVDKGIUSAT^ 

PYWGEKIASVMMKINEK 

SP090 nucleotide (SEQ ID NO: 155)

ATTTGCAGATGATTCTGAAGGATGGCAGTTTGTCCAAGAAAATGGTAGAACCTACTACAAAAAGGGGGA
TCTAAAAGAAACCTACTGGAGAGTGATAGATGGGAAGTACTATTATTTTGATCCTTTATCCGGAGAGAT
GGTTGTCGGCTGGCAATATATACCTGCTCCACACAAGGGGGTTACGATTGGTCCTTCTCCAAGAATAGA 
GATTGCTCTTAGACCAGATTGGTTTTATTTTGGTCAAGATGGTGTATTACAAGAATTTGTTGGCAAGCA 
AGTTTTAGAAGCAAAAACTGCTACGAATACCAACAAACATCATGGGGAAGAATATGATAGCCAAGCAGA 
GAAACGAGTCTATTATTTTGAAGATCAGCGTAGTTATCATACTTTAAAAACTGGTTGGATTTATGAAGA
GGGTCATTGGTATTATTTACAGAAGGATGGTGGCTTTGATTCGCGCATCAACAGATTGACGGTTGGAGA 
GCTAGCACGTGGTTGGGTTAAGGATTACCCTCTTACGTATGATGAAGAGAAGCTAAAAGCAGCTCCATG 
GTACTATCTAAATCCAGCAACTGGCATTATGCAAACAGGTTGGCAATATCTAGGTAATAGATGGTACTA 
CCTCCATTCGTCAGGAGCTATGGCAACTGGCTGGTATAAGGAAGGCTCAACTTGGTACTATCTAGATGC 
TGAAAATGGTGATATGAGAACTGGCTGGCAAAACCTTGGGAACAAATGGTACTATCTCCGTTCATCAGG 
AGCTATGGCAACTGGTTGGTATCAGGAAAGTTCGACTTGGTACTATCTAAATGCAAGTAATGGAGATAT 
GAAAACAGGCTGGTTCCAAGTCAATGGTAACTGGTACTATGCCTATGATTCAGGTGCTTTAGCTGTTAA
TACCACAGTAGGTGGTTACTACTTAAACTATAATGGTGAATGGGTTAAG 

SP090 amino acid (SEQ ID NO: 156) 

VFADDSEGWQFT/QENGRTYYKKGDLKETYV^VIIXSKYYYFDPLSGEMWGWQYIPAPHKGVTIGPSPRI 

EIALRPDWFYFGQDGVLQEFVGKQVLEAKTAWTNKHHGEEYDSQAEKRVYYFEDQRSYHTLKTGWIYE 

EGhVYYLQKIX^FDSRINRLTVGELARGWKDYPLTTO^ 

YLHSSGAMATGWYKEGSWTfLDAENGDMRTGWQ^ 

MKTGWFQVNGNWYYAYDSGALAVNTTVGGYYLNYNGEWVK 

SP091 nucleotide (SEQ ID NO: 157)

TGTCGCTGCAAATGAAACTGAAGTAGCAAAAACTTCGCAGGATACAACGACAGCTTCAAGTAGTTCAGA 
GCAAAATCAGTCTTCTAATAAAACGCAAACGAGCGCAGAAGTACAGACTAATGCTGCTGCCCACTGGGA
TGGGGATTATTATGTAAAGGATGATGGTTCTAAAGCTCAAAGTGAATGGATTTTTGACAACTACTATAA
GGCTTGGTTTTATATTAATTCAGATGGTCGTTACTCGCAGAATGAATGGCATGGAAATTACTACCTGAA 
ATCAGGTGGATATATGGCCCAAAACGAGTGGATCTATGACAGTAATTACAAGAGTTGGTTTTATCTCAA 
GTCAGATGGGGCTTATGCTCATCAAGAATGGCAATTGATTGGAAATAAGTGGTACTACTTCAAGAAGTG 
GGGTTACATGGCTAAAAGCCAATGGCAAGGAAGTTATTTCTTGAATGGTCAAGGAGCTATGATGCAAAA 
TGAATGGCTSCTATGATCCAGCCTATTCTGCTTATTTTTATCTAAAATCCGATGGAACTTATGCTAACC 
AAGAGTGGCAAAAAGTGGGCGGCAAATGGTACTATTTCAAGAAGTGGGGCTATATGGCTCGGAATGAGT 
GGCAAGGCAACTACTATTTGACTGGAAGTGGTGCCATGGCGACTGACGAAGTGATTATGGATGGTACTC 
GCTATATCTTTGCGGCCTCTGGTGAGCTCAAAGAAAAAAAAGATTTGAATGTCGGCTGGGTTCACAGAG
ATGGTAAGCGCTATTTCTTTAATAATAGAGAAGAACAAGTGGGAACCGAACATGCTAAGAAAGTCATTG 
ATATTAGTGAGCACAATGGTCGTATCAATGATTGGAAAAAGGTTATTGATGAGAACGAAGTGGATGGTG 
TCATTGTTCGTCTAGGTTATAGCGGTAAAGAAGACAAGGAATTGGCGCATAACATTAAGGAGTTAAACC 
GTCTGGGAATTCCTTATGGTGTCTATCTCTATACCTATGCTGAAAATGAGACCGATGCTGAGAGTGACG 
CTAAACAGACCATTGAACTTATAAAGAAATACAATATGAACCTGTCTTACCCTATCTATTATGATGTTG
AGAATTGGGAATATGTAAATAAGAGCAAGAGAGCTCCAAGTGATACAGGCACTTGGGTTAAAATCATCA 
ACAAGTACATGGACACGATGAAGCAGGCGGGTTATCAAAATGTGTATGTCTATAGCTATCGTAGTTTAT 
TACAGACGCGTTTAAAACACCCAGATATTTTAAAACATGTAAACTGGGTAGCGGCCTATACGAATGCTT 
TAGAATGGGAAAACCCTCATTATTCAGGAAAAAAAGGTTGGCAATATACCTCTTCTGAATACATGAAAG 
GAATCCAAGGGCGCGTAGATGTCAGCGTTTGGTAT

SP091 amino acid (SEQ ID NO:158) 

VAANETEVAKTSQDT^TASSSSEQNQSSNKTQTSAEVQTN^ 

AV^YINSDGRYSQNEWHGNYYLKSGGYMAQNEWIYDSNYKSWFYLXSDGAYAHQEWQLIGNKWYYFKKW 
GYMAKSQWQGSYFI^GQGAMMQNEWLYDPAYSAYFYLKSDGTY^^ 

QGNYYLTGSGAMATDEVIMDGTRYIFAASGELKEKKDLWGWVHRDGKRYFFNNREEQVGTEHAKKVID
ISEHNGRINDWKKVIDENEVDGVIVRLGYSGKEDKEIAHNIKELNRLGIPYGWLYTYAENETDAESDA 
KQTIELIKKYNMNLSYPIYYDVENWEYVNKSKRAPSDTGTWVKIINKYMDTMKQAGYQNVYVYSYRSLL
QTRLKHPDILKHVNWVAAYTOALEWENPHYSGXKGWQYTSSEYMKGIQGRVDVSVWY 

SP092 nucleotide (SEQ ID NO: 159) 

TACGTCTCAGCCTACTTTTGTAAGAGCAGAAGAATCTCCACAAGTTGTCGAAAAATCTTCATTAGAGAA 

GAAATATGAGGAAGCAAAAGCAAAAGCTGATACTGCCAAGAAAGATTACGAAACGGCTAAAAAGAAAGC 

AGAAGACGCTCAGAAAAAGTATGAAGATGATCAGAAGAGAACTGAGGAGAAAGCTCGAAAAGAAGCAGA 

AGCATCTCAAAAATTGAATGATGTGGCGCTTGTTGTTCAAAATGCATATAAAGAGTACCGAGAAGTTCA 

AAATCAACGTAGTAAATATAAATCTGACGCTGAATATCAGAAAAAATTAACAGAGGTCGACTCTAAAAT 

AGAGAAGGCTAGGAAAGAGCAACAGGACTTGCAAAATAAATTTAATGAAGTAAGAGCAGTTGTAGTTCC 

TGAACCAAATGCGTTGGCTGAGACTAAGAAAAAAGCAGAAGAAGCTAAAGCAGAAGAAAAAGTAGCTAA

GAGAAAATATGATTATGCAACTCTAAAGGTAGCACTAGCGAAGAAAGAAGTAGAGGCTAAGGAACTTGA 

AATTGAAAAACTTCAATATGAAATTTCTACTTTGGAACAAGAAGTTGCTACTGCTCAACATCAAGTAGA

TAATTTGAAAAAACTTCTTGCTGGTGCGGATCCTGATGATGGCACAGAAGTTATAGAAGCTAAATTAAA 

AAAAGGAGAAGCTGAGCTAAACGCTAAACAAGCTGAGTTAGCAAAAAAACAAACAGAACTTGAAAAACT 

TCTTGACAGCCTTGATCCTGAAGGTAAGACTCAGGATGAATTAGATAAAGAAGCAGAAGAAGCTGAGTT

GGATAAAAAAGCTGATGAACTTCAAAATAAAGTTGCTGATTTAGAAAAAGAAATTAGTAACCTTGAAAT

ATTACTTGGAGGGGCTGATNCTGAAGATGATACTGCTGCTCTTCAAAATAAATTAGCTACTAAAAAAGC

TGAATTGGAAAAAACTCAAAAAGAATTAGATGCAGCTCTTAATGAGTTAGGCCCTGATGGAGATGAAGA

AGAAACTCCAGCGCCGGCTCCTCAACCAGAGCAACCAGCTCCTGCACCAAAACCAGAGCAACCAGCTCC 

AGCTCCAAAACCAGAGCAACCAGCTCCTGCACCAAAACCAGAGCAACCAGCTCCAGCTCCAAAACCAGA 

GCAACCAGCTCCAGCTCCAAAACCAGAGCAACCAGCTAAGCCGGAGAAACCAGCTGAAGAGCCTACTCA

ACCAGAAAAACCAGCCACTCCAAAAACAGGCTGGAAACAAGAAAACGGTATGTGGTATTTCTACAATAC

TGATGGTTCAATGGCAATAGGTTGGCTCCAAAACAACGGTTCATGGTACTACCTAAACGCTAACGGCGC

TATGGCAACAGGTTGGGTGAAAGATGGAGATACCTGGTACTATCTTGAAGCATCAGGTGCTATGAAAGC 

AAGCCAATGGTTCAAAGTATCAGATAAATGGTACTATGTCAACAGCAATGGCGCTATGGCGACAGGCTG

GCTCCAATACAATGGCTCATGGTACTACCTCAACGCTAATGGTGATATGGCGACAGGATGGCTCCAATA 

CAACGGTTCATGGTATTACCTCAACGCTAATGGTGATATGGCGACAGGATGGGCTAAAGTCAACGGTTC 

ATGGTACTACCTAAACGCTAACGGTGCTATGGCTACAGGTTGGGCTAAAGTCAACGGTTCATGGTACTA 
CCTAAACGCTAACGGTTCAATGGCAACAGGTTGGGTGAAAGATGGAGATACCTGGTACTATCTTGAAGC
ATCAGGTGCTATGAAAGCAAGCCAATGGTTCAAAGTATCAGATAAATGGTACTATGTCAATGGCTTAGG 
TGCCCTTGCAGTCAACACAACTGTAGATGGCTATAAAGTCAATGCCAATGGTGAATGGGTT 

SP092 amino acid (SEQ ID NO: 160)

TSQPTFVRAEESPQVVEXSSLEKKY^EAKAKAOTAKKDYETA 
ASQKUJDVALWQNAYKEYREVQNQRSKYKSDAEYQKKLTE^ 

EPNAIAETKKKAEEAKAEEKVAKRKYDYATLKVALAKKEVEAKELEIEKLQYEISTLEQEVATAQHQVD 
NLKKLIJVGADPDDGTEVIEAKLKKGE1AELNAKQAELAKKQTELEKLLDSLDPEGKTQDELDKEAEEAEL 
DKKADELQNKVADLEKEISNLEILLGGADXEDDTAA^ 

ETPAPAPQPEQPAPAPKPEQPAPAPKPEQPAPAPKPEQPAPAPKPEQPAPAPKPEQPAKPEKPAEEPTQ 

PEKPATPKTGWKQENGMWYFYNTDGSMAIGWLQ^GSWYYLNANGAMATGWVKTC 

SQWFKVSDKWYYVNSNGAMATGWLQYNGSWYYLNANGDMATGWLQYNGSWYYLNANGDMATGWAKVNGS

WYYLNANGAMATGWAKVNGSWYYLNANGSMATGWVKDGDTWYYLEASGAMKASQWFKVSDKWYYVNGLG

ALAVNTTVDGYKVNANGEWV

SP093 nucleotide (SEQ ID NO: 161)

TGGACAGGTGAAAGGTCATGCTACATTTGTGAAATCCATGACAACTGAAATGTACCAAGAACAACAGAA
CCATTCTCTCGCCTACAATCAACGCTTGG^^TCGCAAAATCGCATTGTAGATCCTTTTTTGGCGGAGGG 
ATATGAGGTCAATTACCAAGTGTCTGACGACCCTGATGCAGTCTATGGTTACTTGTCTATTCCAAGTTT 
GGAAATCATGGAGCCGGTTTATTTGGGAGCAGATTATCATCATTTAGGGATGGGCTTGGCTCATGTGGA 
TGGTACACCGCTGCCTCTGGATGGTACAGGGATTCGCTCAGTGATTGCTGGGCACCGTGCAGAGCCAAG 
CCATGTCTTTTTCCGCCATTTGGATCAGCTAAAAGTTGGAGATGCTCTTTATTATGATAATGGCCAGGA 
AATTGTAGAATATCAGATGATGGACACAGAGATTATTTTACCGTCGGAATGGGAAAAATTAGAATCGGT
TAGCTCTAAAAATATCATGACCTTGATAACCTGCGATCCGATTCCTACCTTTAATAAACGCTTATTAGT 
GAATTTTGAACGAGTCGCTGTTTATCAAAAATCAGATCCACAAACAGCTGCAGTTGCGAGGGTTGCTTT 
TACGAAAGAAGGACAATCTGTATCGCGTGTTGCAACCTCTCAATGGTTG

SP093 amino acid (SEQ ID NO: 162)

GQVKGHATFVKSMTTEMYQEQQ^SLAYNQRLXSQNRIVDPFIAEGYEVNYQVSDDPDAVYGYLSIPSL 
EIMEPVYLGADYHHLGMGLAHVDGTPLPLDGTGIRSVIAGHRAEPSHVFFRHLDQLKVGDALYYDNGQE 
IVEYQMMDTEIILPSEWEKLESVSSKNIMTLITCDPIPTFNKRLLVNFERVAVYQKSDPQTAAVARVAF
TKEGQSVSRVATSQWL 

SP094 nucleotide (SEQ ID NO: 163) 

GATTGCTCCTTTGAAGGATTTGAGAGAAACCATGTTGGAAATTGCTTCTGGTGCTCAAAATCTTCGTGC 
CAAGGAAGTTGGTGCCTATGAACTGAGAGAAGTAACTCGCCAATTTAATGCTATGTTGGATCAGATTGA 
TCAGTTGATGGTAGCTATTCGTAGCCAGGAAGAAACGACCCGTCAGTACCAACTTCAAGCCCTTTCGAG 
CCAGATTAATCCACATTTCCTCTATAACACTTTGGACACCATCATCTGGATGGCTGAATTTCATGATAG
TCAGCGAGTGGTGCAGGTGACCAAGTCCTTGGCAACCTATTTCCGCTTGGCGCTCAATCAAGGCAAGGA 
CTTGATTTGTCTCTCTGACGAAATCAATCATGTCCGCCAGTATCTCTTTATCCAGAAACAACGCTATGG 
AGATAAGCTGGAATACGAAATTAATGAAAATGTTGCCTTTGATAATTTAGTCTTACCCAAGCTGGTCCT
ACAACCCCTTGTAGAAAATGCTCTTTACCATGGCATTAAGGAAAAGGAAGGTCAGGGCCATATTAAACT 
TTCTGTCCAGAAACAGGATTCGGGATTGGTCATCCGTATTGAGGATGATGGCGTTGGCTTCCAAGATGC 
TGGTGATAGTAGTCAAAGTCAACTCAAACGTGGGGGAGTTGGTCTTCAAAATGTCGATCAACGGCTCAA
ACTTCATTTTGGAGCCAATTACCATATGAAGATTGATTCTAGACCCCAAAAAGGGACGAAAGTTGAAAT
ATATATAAATAGAATAGAAACTAGC 

SP094 amino acid (SEQ ID NO: 164) 

IAPLKDLRETMLEIASGAQNLRAKEVGAYELREVTRQFNAMLDQIDQLMVAIRSQEETTRQYQLQALSS 
QINPHFLYNTLDTIIWMAEFHDSQRWQVTKSLATYFRLALNOGKDLICLSDEINHVRQYLFIQKQRYG 
DKLEYEINENVAFDNtVLPKLVLQPLVENALYHGIKEKEGOGHIKLSVQKQDSGLVIRIEDDGVGFQDA 
GDSSQSQLKRGGVGLQNVDQRLKLHFGANYHMKIDSRPQKGTKVEIYINRIETS 

SP095 nucleotide (SEQ ID NO: 165) 

TAGGTCATATCGGACTTTTTTTCTACAACAAAATAGGCTCCATAATATCTATAAGGGATTTACCCACTA 
CAAATATTATAGAGCCGAAAATTCACATCTAATATATGCAGACTACTTTGAAATGAAATTAAAAAAATT
ATTAAAGGATGACACAAAAGTTTTTGAAAAATCTACATTCAAATTTGTAGAAGGATATAAAATATACCT 
GACAGAATCTAAAGAATCTGGAATTAAACAAATGGACAATGTCATAAAATATTTTGAGTTTATTGAATC
TAAAAGTATTGCTTTATATTTTCAAAAACGATTAAATGAGCTGATAGAT 

SP095 amino acid (SEQ ID NO: 166)

RSYGTFFLQQNRLHNIYKGFTHYKYYRAENSHLIYAD^EMKLKKLLKDDTKVFEKSTFKFVEGYKIYL 
TESKESGIKQMDNVIKYFEFIESKSIALYFQKRLNELID 

SP096 nucleotide (SEQ ID NO: 167)

CAACGTTGAGAATTATTTGCGAATGTGTTTGGATAGCATTCAGAATCAGACGTATCAAAATTTTGAGTG 
TTTATTAATCAATGATGGCTCTCCAGATCATTCATCCAAAATATGTGAAGAATTTGTAGAGAAAGATTC 
TCGTTTCAAATATTTTGAGAAAGCAAACGGCGGTCTTTCATCAGCTCGTAACCTAGGTATTGAATGTTC 
GGGGGGGGGCGTACATTACTTTTGTAGACTC 

SP096 amino acid (SEQ ID NO:168) 

WEOTLRMCLDSIQNQTYQNFECLLINDGSPDHSSKICEEFVEKDSRFKYFEKANGGLS3ARNLGIECS 
GGGVHYFCRL 

SP097 nucleotide (SEQ ID NO: 169)

CTACTATCAATCAAGTTCTTCAGCCATTGAGGCCACCATTGAGGGCAACAGCCAAACGACCATCAGCCA 
GACTAGCCACTTTATTCAGTCTTATATCAAAAAACTAGAAACCACCTCGACTGGTTTGACCCAGCAGAC 
GGATGTTCTGGCCTATGCTGAGAATCCCAGTCAAGACAAGGTCGAGGGAATCCGAGATTTGTTTTTGAC 
CATCTTGAAGTCAGATAAGGACTTGAAAACTGTTGTGCTGGTGACCAAATCTGGTCAGGTCATTTCTAC 
AGATGACAGTGTGCAGATGAAAACTTCCTCTGATATGATGGCTGAGGATTGGTACCAAAAGGCCATTCA 
TCAGGGAGCTATGCCTGTTTTGACTCCAGCTCGTAAATCAGATAGTCAGTGGGTCATTTCTGTCACTCA 
AGAACTTGTTGATGCAAAGGGAGCCAATCTTGGTGTGCTTCGTTTGGATATTTCTTATGAAACTCTGGA
AGCCTATCTCAATCAACTCCAGTTGGGGCAGCAGGGCTTTGCCTTCATTATCAATGAAAACCATGAATT 
TGTCTACCATCCTCAACACACAGTTTATAGTTCGTCTAGCAAAATGGAGGCTATGAAACCCTACATCGA 
TACAGGTCAGGGTTATACTCCTGGTCACAAATCCTACGTCAGTCAAGAGAAGATTGCAGGAACTGATTG
GACGGTGCTTGGCGTGTCATCATTGGAAAAGTTAGACCAGGTTCGGAGTCAG 

SP097 amino acid (SEQ ID NO: 170) 

YYQSSSSAIEATIEGNSQTTISQTSHFIQSYIKXLETTSTGLTQQTDVLAYAENPSQDKVEGIRDLFLT 
ILKSDKDLKTWLVTKSGQVISTDDSVQMKTSSDMMAEDWYQKAIHQGAMPVLTPARKSDSQWVISVTQ 
ELVDAKGANLGVLRLDISYETLEAYl^QLQLGQQGFAF^ 
TGQGYTPGHKSYVSQEKIAGTDWTVLGVS3LEKLDQVRSQ 

SP098 nucleotide (SEQ ID NO: 171)

GACAAAAACATTAAAACGTCCTGAGGTTTTATCACCTGCAGGGACTTTAGAGAAGCTAAAGGTAGCTGT 
TCAGTATGGAGCAGATGCTGTCTTTATCGGTGGTCAGGCCTATGGTCTTCGTAGCCGTGCGGGAAACTT 
TACTTTCGAACAGATGGAAGAAGGCGTGCAGTTTGCGGCCAAGTATGGTGCCAAGGTCTATGTAGCGGC 
TAATATGGTTATGCACGAAGGAAATGAAGCTGGTGCTGGTGAGTGGTTCCGTAAACTGCGTGATATCGG 
GATTGCAGCAGTTATCGTATCTGACCCAGCCTTGATTATGATTGCAGTGACTGAAGCACCAGGCCTTGA 
AATCCACCTTTCTACCCAAGCCAGTGCCACTAACTATGAAACCCTTGAGTTCTGGAAAGAGCTAGGCTT 
GACTCGTGTCGTTTTAGCGCGTGAGGTTTCAATGGAAGAATTAGCTGAGATCCGCAAACGTACAGATGT 
TGAAATTGAAGCCTTTGTCCATGGAGCTATGTGTATTTCATACTCTGGACGTTGTACTCTTTCAAACCA
CATGAGTATGCGTGATGCCAACCGTGGTGGATGTTCTCAGTCATGCCGTTGGAAATACGACCTTTACGA
TATGCCATTTGGGAAAGAACGTAAGAGTTTGCAGGGTGAGATTCCAGAAGAATTTTCAATGTCAGCCGT 
TGACATGTCTATGATTGACCANATTCCAGATATGATTGAAAATGGTGTGGACAGTCTAAAAATCGAAGG 
ACGTATGNAGTCTATTCACTANGTATCAACAGTAACCAACTGCTACAAGGCGGCTGTGGATGCCTATCT 
TGAAAGTCCTGAAAAGTTTGAAGCTATCAAACAAGACTTGGTGGACGAGATGTGGAAGGTTGCCCAACG
TGAACTGGCTACAGGATTTTACTATGGTACACCATCTGAAAATGAGCAGTTGTTTGGTGCTCGTCGTAA 
AATCCCTGAGTACAAGTTTGTCGCTGAAGTGGTTTCTTATGATGATGCGGCACAAACAGCAACTATTCG 
TCAACGAAACGTCATTAACGAAGGGGACCAAGTTGAGTTTTATGGTCCAGGTTTCCGTCATTTTGAAAC 
CTATATTGAAGATTTGCATGATGCTAAAGGCAATAAAATCGACCGCGCTCCAAATCCAATGGAACTATT
GACTATTAAAGTCCCACAACCTGTTCAATCAGGAGACATGGTTCGAGCTCTTAAAGAGGGGCTTATCAA 
TCTTTATAAGGAAGATGGAACCAGCGTCACAGTTCGTGCT 

SP098 amino acid (SEQ ID NO: 172)

TKTLKRPE^LSPAGTLEKLKVAVQYGADAVFIGGQAYGLRSRAGNFTFEQMEEGVQFAAKYGAKVYVAA 
NMVMHEGNEAGAGEWFRXLRDIGIAAVIVSDPALIMIAVTEAPGuEIKL^ 

TRWLAREVSMEELAEIRKRTDVEIEAFVHGAMCISYSGRCTLSNHMSMRDANRGGCSQSCRWXYDLYD
MPFGKERKSLQGEIPEEFSMSAVDMSMIDXIPDMIENGVDSLKIEGRMXSIHXVSTVTNCYKAAVDAYL 
ESPEKFEAIKQDLVDEMWKVAQRELATGFYYGTPSENEQLFGARRKIPEYKFVAEWSYDDAAQTATIR 
QRNVINEGDQVEFYGPGFRHFETYIEDLKDAKGNKIDRAPNPMELLTIKVPQPVQSGDMVRALKEGLIN 
LVKEDGTSVTVRA 

SP099 nucleotide (SEQ ID NO: 173)

TTCTCAGGAGACCTTTAAAAATATCACCAATAGCTTCTCCATGCAAATCAATCGTCGCGTCAACCAAGG 
AACGCCTCGTGGTGCTGGGAATATCAAGGGTGAAGACATCAAAAAAATCACCGAAAACAAGGCCATTGA 
GTCTTATGTCAAACGTATCAACGCTATCGGAGATTTGACTGGATATGACCTGATTGAAACGCCAGAAAC
CAAGAAGAATCTCACTGCTGATCGTGCCAAGCGTTTTGGAAGTAGCTTGATGATTACAGGTGTCAATGA 
CTCCTCTAAAGAAGACAAGTTTGTCTCTGGTTCTTATAAACTAGTCGAAGGAGAGCACTTAACCAACGA 
CGACAAGGATAAAATCCTCTTGCACAAGGACTTGGCAGCCAAACACGGCTGGAAAGTAGGGGACAAGGT
TAAACTGGACTCTAATATCTACGATGCAGATAATGAAAAAGGAGCCAAGGAAACAGTTGAAGTGACAAT 
CAAGGGACTCTTTGATGGTCATAATAAGTCAGCAGTAACCTACTCACAAGAACTTTACGAAAACACAGC 
TATTACAGACATTCACACTGCTGCAAAACTTTATGGATACACAGAAGACACAGCCATTTATGGGGACGC 
AACCTTCTTTGTAACAGCAGACAAGAACTTGGATGATGTTATGAAAGAGTTGAATGGCATCAGTGGTAT
CAACTGGAAGAGCTACACACTCGTCAAGAGCTCCTCTAACTACCCAGCTCTTGAGCAATCTATCTCTGG 
TATGTACAAGATGGCCAAC 

SP099 amino acid (SEQ ID NO:174) 

SQETFKNITNSFSMQINRRVNQGTPRGAGNIKGEDIKKITENKAIESYVKRINAIGDLTGYDLIETPET
KKNLTADRAKRFGSSLMITGVNDSSKEDKFVSGSYKLVEGEHLTNDDKDKILLHKDLAAKHGWKVGDKV
KLDSNIYDADNEKGAKETVEVTIKGLFDGHNKSAVTYSQELYENTAITDIHTAAKLYGYTEDTAIYGDA
TFFVTADKNLDDVMKELNGISGINWKSYTLVKSSSNYPALEQSISGMYKMAN

SP100 nucleotide (SEQ ID NO:175) 

AGTAAATGCGCAATCAAATTCATTAATATTAATAGATGAACCTGAAATCTCACTTCATCCGAGTGCAAT 

CTATAAATTTAAAGAGTTTTTACTTCAAGAGTGTTTAAATAAAAAACATCAAATTATTATCA 

TTCTACACAACTTATAAAAGATTTTCCTAGAGAAGCCGTGAAACTTTTAGTGAAAAACGGAGAAAAGGT 

AGATGTTATTGAAAATATTGATTATCAGGATGCATTTTTTGAATTAGGTGATGTGTATCATTC 

GATGATTTATGTTGAAGATAGACTAGCTAAATATATTCTAGAGTTTGTTATCACTCATTCAGGTAGTGA 

GAATCTTAAACAGAATTTAGTAGTGAGATATATTCCTGGTGGAGCAAATCAAATAATTTGTAATAATAT

TTTAAACTCATCGTATTTAGATTCCGATAACCATTATTTTTGGCTTGATGGAGATCAAAACACTAATGT 

TAGTGAATCAAATAATTTAATGAACTATCTTGAAAATGGTGTTGTTATATCAGATAAAATTCCTGAATC 

AGATAATAAAAATCTTGATGATATTATAAAATTGATAANGGGATGTCCAATTAAATTTAATGTTTCAGG 

TAATAAAGGGCAAAAAAATAATATTGAATTAATTGCGAAACAAAGAAGCTTTATAGATTATTGGGCTAA

ATAC 

SP100 amino acid (SEQ ID NO:176) 

VNAQSNSLILIDEPEISLHPSAIYXFKEFLLQECLNKKHQIIITTHSTQLIXDFPREAVKLLVKNGEKV 

dvienidyodaffelgdvyhsrkmiyvedrlaxyilefvithsgsenlkonlwryipgganqiicnni 

lnssyldsdnhyfwldgdqotnvsesnn^ 

NKGQKNNIELIAKQRSFIDYWAKY

SP101 nucleotide (SEQ ID NO:177) 

TTACCGCGTTCATCAAGATGTCAAACAAGTCATGACCTATCAACCCATGGTGCGAGAAATATTGAGTGA 
ACAAGACACCCCAGCAAACGAAGAGCTTGTGCTTGCTATGATTTATACTGAAACAAAAGGAAAAGAAGG 
CGATGTTATGCAGTCTAGTGAGTCTGCAAGTGGTTCCACCAACACCATCAATGATAATGCCTCTAGCAT 
TCGGCAAGGCATTCAAACTCTGACAGGCAATCTCTATCTGGCGCAGAAGAAGGGGGTAGATATCTGGAC 

AGCTGTTCAAGCCTATAATTTTGGACCTGCCTATATCGATTTTATCGCCCAAAATGGCAAGGAAAATAC
CCTGGCTCTAGCCAAACAGTACTCTCGTGAGACTGTTGCCCCCTTGCTTGGTAATAGGACTGGAAAGAC
TTATAGTTATATTCACCCCATTTCCATTTTTCACGGTGCTGAACTCTATGTAAATGGAGGAAACTATTA 
TTATTCTAGACAGGTACGACTTAACCTTTACATCATCAAATGTTTCACTCTCTTTTCAACATCTGGC

SP101 amino acid (SEQ ID NO:178)

YRVHQDVKQVMTYQPMVREILSEQDTPANEELVLAMIYTETKGKEGDVMQSSESASGSTNTINDNASSI

RQGIQTLTGNLYLAQKKGVDr^AVQAYNF^^ 
YSYIHPISIFHGAELYVNGGNYYYSRQVRLNLYIIKCFTLFSTSG 

SP102 nucleotide (SEQ ID NO:179) 

GTGGATGGGCTTTAACTATCTTCGTATTCGCCGTGCGGCTAAAATTGTGGACAATGAGGAGTTTGAAGC 
CTTGATTCGTACGGGTCAATTGATTGATTTGCGCGACCCAGCAGAATTCCACAGAAAACATATCCTTGG 
TGCACGCAATATTCCTTCAAGTCAGTTGAAAACTAGTCTTGCAGCCCTTCGTAAAGATAAACCTGTCCT 
TCTCTACGAAAACCAACGTGCGCAACGAGTTACAAATGCAGCTCTTTACTTGAAAAAACAAGGTTTTTC 
TGAGATTTATATCCTTTCTTATGGCTTGGATTCTTGGAAAGGGAAAGTGAAGACTAGC 

SP102 amino acid (SEQ ID NO: 180) 

WMGFNYLRIRRAAKIVDNEEFEALIRTGQLIDLRDPAEFHRKHILGARNIPSSQLKTSLAALRKDKPVL
LYENQRAQRVTNAALYLKKQGFSEIYILSYGLDSWKGKVKTS 

SP103 nucleotide (SEQ ID NO:181)

ACTAAACCAGCATCGTTCGCAGGAAAATAAGGACAATAATCGTGTCTCTTATGTGGATGGCAGCCAGTC 
AAGTCAGAAAAGTGAAAACTTGACACCAGACCAGGTTAGCCAGAAAGAAGGAATTCAGGCTGAGCAAAT 
TGTAATCAAAATTACAGATCAGGGCTATGTAACGTCACACGGTGACCACTATCATTACTATAATGGGAA 
AGTTCCTTATGATGCCCTCTTTAGTGAAGAACTCTTGATGAAGGATCCAAACTATCAACTTAAAGACGC
TGATATTGTCAATGAAGTCAAGGGTGGTTATATCATCAAGGTCGATGGAAAATATTATGTCTACCTGAA 
AGATGCAGCTCATGCTGATAATGTTCGAACTAAAGATGAAATCAATCGTCAAAAACAAGAACATGTCAA 
AGATAATGAGAAGGTTAACTCTAATGTTGCTGTAGCAAGGTCTCAGGGACGATATACGACAAATGATGG 
TTATGTCTTTAATCCAGCTGATATTATCGAAGATACGGGTAATGCTTATATCGTTCCTCATGGAGGTCA 
CTATCACTACATTCCCAAAAGCGATTTATCTGCTAGTGAATTAGCAGCAGCTAAAGCACATCTGGCTGG 
AAAAAATATGCAACCGAGTCAGTTAAGCTATTCTTCAACAGCTAGTGACAATAACACGCAATCTGTAGC
AAAAGGATCAACTAGCAAGCCAGCAAATAAATCTGAAAATCTCCAGAGTCTTTTGAAGGAACTCTATGA 
TTCACCTAGCGCCCAACGTTACAGTGAATCAGATGGCCTGGTCTTTGACCCTGCTAAGATTATCAGTCG 
TACACCAAATGGAGTTGCGATTCCGCATGGCGACCATTACCACTTTATTCCTTACAGCAAGCTTTCTGC
CTTAGAAGAAAAGATTGCCAGAATGGTGCCTATCAGTGGAACTGGTTCTACAGTTTCTACAAATGCAAA 
ACCTAATGAAGTAGTGTCTAGTCTAGGCAGTCTTTCAAGCAATCCTTCTTCTTTAACGACAAGTAAGGA 
GCTCTCTTCAGCATCTGATGGTTATATTTTTAATCCAAAAGATATCGTTGAAGAAACGGCTACAGCTTA 
TATTGTAAGACATGGTGATCATTTCCATTACATTCCAAAATCAAATCAAATTGGGCAACCGACTCTTCC
AAACAATAGTCTAGCAACACCTTCTCCATCTCTTCCAATCAATCCAGGAACTTCACATGAGAAACATGA 
AGAAGATGGATACGGATTTGATGCTAATCGTATTATCGCTGAAGATGAATCAGGTTTTGTCATGAGTCA 

CGGAGACCACAATCATTATTTCTTCAAGAAG

SP103 amino acid (SEQ ID NO:182)

LNQHRSQENKDNNRVSYVDGSQSSQKSENLTPDQVSQKEGIQAEQIVIKITDQGYVTSHGDHYHYYNGK

VPYDALFSEELLMKDPNYQLKDADIVNEVKGGYIIKVDGKYYWLKDAAHADNVR^^ 

DNEKVNSWAVARSQGRYTTNDG'/VFNP>I3IIEDTGNAYIVPHGGKYHYIPKSDLSASELAAAKAHLAG 

KNMQPSQLSYSSTASDNNTQSVAKGSTSKPANKSENLQSLLKELYDSPSAORYSESDGLVFDPAKIISR 

TPNGVAIPHGDKYHFIPYSKLSALEEKIARMVPISGTGSTVSTNAKPNEWSSLGSLSSNPSSLTTSKE 

LSSASDGYIFNPKDIVEETATAYIVRHGDKFHYIPKSNQIGQPTLPNNSLATPSPSLPINPGTSHEKHE 

EDGYGFDANRIIAEDESGFVMSHGDHNHYFFKK



SP105 nucleotide (SEQ ID NO:183)

TGACTACCTTGAAATCCCACTTTACAGCTATCTTGGTGGATTCAACACTAAAGTTCTTCCAACTCCAAT 
GATGAACATCATCAACGGTGGTTCTCACTCTGACGCTCCAATCGCTTTCCAAGAGTTCATGATCTTGCC 
AGTTGGTGCGCCAACATTTAAAGAAGCCCTTCGTTACGGTGCTGAAATCTTCCACGCTCTTAAGAAAAT 
CCTTAAATCACGTGGTTTGGAAACTGCCGTAGGTGACGAAGGTGGATTCGCTCCTCGTTTCGAAGGAAC 
TGAAGATGGTGTTGAAACTATCCTTGCTGCGATTGAAGCTGCTGGATATGTACCAGGTAAAGACGTATT 
TATCGGATTTGACTGTGCTTCATCAGAATTCTACGATAAAGAACGTAAAGTTTACGACTACACTAAATT 
TGAAGGTGAAGGTGCTGCTGTTCGTACATCTGCAGAACAAATCGACTACCTTGAAGAATTGGTTAACAA 
ATACCCAATCATCACTATTGAAGATGGTATGGATGAAAACGACTGGGATGGTTGGAAAGCTCTTACTGA 
ACGTCTTGGTAAGAAAGTACAACTTGTTGGTGACGACTTCTTCGTAACAAACACTGACTACCTTGCACG
TGGTATCCAAGAAGGTGCTGCTAACTCAATCCTTATCAAAGTTAACCAAATCGGTACTCTTACTGAAAC
TTTTGAAGCTATCGAAATGGCTAAAGAAGCTGGTTACACTGCTGTTGTATCACACCGTTCAGGTGAAAC 
TGAAGATTCAACAATCGCTGATATTGCAGTTGCAACTAACGCAGGACAAATCAAGACTGGTTCACTTTC
ACGTACAGACCGCATCGCTAAATACAACCAATTGCTTCGTATCGAAGACCAACTTGGTGAAGTAGCTGA 
ATATCGTGGATTGAAATCATTCTACAACCTTAAAAAA

SP105 amino acid (SEQ ID NO:184) 

DYLEIPLYSYLGGFNTKVLPTPMMNIINGGSHSDAPIAFQEFMILPVGAPTFKEALRYGAEIFHALKKI
LKSRGLETAVGDEGGFAPRFEGTEDGVETILAAIEAAGYVPGKDVFIGFDCASSEFYDKERKVYDYTKF
EGEGAAVRTSAEQIDYLEELVNKYPIITIEDGMDENDWDGWKALTERI^ 

GIQEGAANSILIKVNQIGTLTETFEAIEMAKEAGYTAVVSHRSGETEDSTIADIAVATNAGQIKTGSLS
RTDRIAKYNQLLRIEDQLGEVAEYRGLKSFYNLKK 

SP106 nucleotide (SEQ ID NO:185) 

TCGTATCTTTTTTTGGAGCAATGTTCGCGTAGAAGGACATTCCATGGATCCGACCCTAGCGGATGGCGA 
AATTCTCTTCGTTGTAAAACACCTTCCTATTGACCGTTTTGATATCGTGGTGGCCCATGAGGAAGATGG 
CAATAAGGACATCGTCAAGCGCGTGATTGGAATGCCTGGCGACACCATTCGTTACGAAAATGATAAACT 
CTACATCAATGACAAAGAAACGGACGAGCCTTATCTAGCAGACTATATCAAACGCTTCAAGGATGACAA 
ACTCCAAAGCACTTACTCAGGCAAGGGCTTTGAAGGAAATAAAGGAACTTTCTTTAGAAGTATCGCTCA 
AAAAGCTCAAGCCTTCACAGTTGATGTCAACTACAACACCAACTTTAGCTTTACTGTTCCAGAAGGAGA
ATACCTTCTCCTCGGAGATGACCGCTTGGTTTCGAGCGACAGCCGCCACGTAGGTACCTTCAAAGCAAA 
AGATATCACAGGGGAAGCTAAATTCCGCTTATGGCCAATCACCCGTATCGGAACATTT 

SP106 amino acid (SEQ ID NO:186)

RIFFWSNVRVEGHSMDPTLADGEILFVVKHLPIDRFDIVVAHEEDGNKDIVKRVIGMPGDTIRYENDKL
YINDKETDEPYLADYIKRFKDDKLQSTYSGKGFEGNKGTFFRSIAQKAQAFTVDVNYNTNFSFTVPEGE
YLLLGDDRLVSSDSRHVGTFKAKDITGEAKFRLWPITRIGTF

SP107 nucleotide (SEQ ID NO:187) 

GGACTCTCTCAAAGATGTGAAAGCAAATGCTAGCGACAGCAAGCCTGCACAGGACAAGAAGGATGCAAA
ACAAGGAACGGAAGATAGTAAGGATTCAGATAAGATGACTGAAACAAACTCAGTTCCGGCAGGAGTGAT 
TGTGGTCAGTCTACTTGCCCTCCTAGGCGTGATTGCCTTCTGGCTGATTCGCCGTAAGAAAGAGTCAGA 
AATCCAGCAATTAAGCACGGAATTGATCAAGGTTCTAGGACAGCTAGATGCAGAAAAAGCGGATAAAAA
AGTCCTTGCCAAAGCCCAAAACCTTCTCCAAGAAACCCTTGATTTCGTGAAAGAAGAAAATGGCTCAGC
AGAGACAGAAACTAAACTAGTAGAGGAGCTTAAAGCAATCCTTGACAAACTCAAG 

SP107 amino acid (SEQ ID NO:188)

DSLKDVKANASDSKPAQDKKDAKQGTEDSKDSDKMTETNSVPAGVIVVSLLALLGVIAFWLIRRKKESE
IQQLSTELIKVLGQLDAEKADKKVLAKAQNLLQETLDFVKEENGSAETETKLVEELKAILDKLK

SP108 nucleotide (SEQ ID NO:189)

CAAGAAATCCTATCATCTCTTCCAGAAGCAAACAGAGACGAGGGGAATTCAGACTCAGTTGATTGAAGA
ATCGCTTAGTCAGCAGACTATAATCCAGTCCTTCAATGCTCAAACAGAATTTATCCAAAGATTGCGTGA
GGCTCATGACAACTACTCAGGCTATTCTCAGTCAGCCATCTTTTATTCTTCAACGGTCAATCCTTCGAC 
TCGCTTTGTAAATGCACTCATTTATGCCCTTTTAGCTGGAGTAGGAGCTTATCGTATCATGATGGGTTC
AGCCTTGACCGTCGGTCGTTTAGTGACTTTTTTGAACTATGTTCAGCAATACACCAAGCCCTTTAACGA 
TATTTCTTCAGTGCTAGCTGAGTTGCAAAGTGCTCTGGCTTGCGTAGAGCGTATCTATGGAGTCTTAGA
TAGCCCTGAAGTGGCTGAAACAGGTAAGGAAGTCTTGACGACCAGTGACCAAGTTAAGGGAGCTATTTC 
CTTTAAACATGTCTCTTTTGGCTACCATCCTGAAAAAATTTTGATTAAGGACTTGTCTATCGATATTCC
AGCTGGTAGTAAGGTAGCCATCGTTGGTCCGACAGGTGCTGGAAAATCAACTCTTATCAATCTCCTTAT 
GCGTTTTTATCCCATTAGCTCGGGAGATATCTTGCTGGATGGGCAATCCATTTATGATTATACACGAGT 
ATCATTGAGACAGCAGTTTGGTATGGTGCTTCAAGAAACCTGGCTCACACAAGGGACCATTCATGATAA 
TATTGCCTTTGGCAATCCTGAAGCCAGTCGAGAGCAAGTAATTGCTGCTGCCAAAGCAGCTAATGCAGA 
CTTTTTCATCCAACAGTTGCCACAGGGATACGATACCAAGTTGGAAAATGCTGGAGAATCTCTCTCTGT 
CGGCCAAGCTCAGCTCTTGACCATAGCCCGAGTCTTTCTGGCTATTCCAAAGATTCTTATCTTAGACGA
GGCAACTTCTTCCATTGATACACGGACAGAAGTGCTGGTACAGGATGCCTTTGCAAAACTCATGAAGGG 
CCGCACAAGTTTCATCATTGCTCACCGTTTGTCAACCATTCAGGATGCGGATTTAATTCTTGTCTTAGT
AGATGGTGATATTGTTGAATATGGTAACCATCAAGAACTCATGGATAGAAAGGGTAAGTATTACCAAAT

GCAAAAAGCTGCGGCTTTTAGTTCTGAA

SP108 amino acid (SEQ ID NO:190)

KKSYHLFQKQTETRGIQTQLIEESLSQQTIIQSFNAQTEFIQRLREAHDNYSGYSQSAIFYSSTVNPST 
RFVNALIYALIAGVGAYRIMMGSALTVGRLVTFLNW 

SPEVAETGKEVLTTSDQVKGAISFKHVSFGYHPEKILIKDLSIDIPAGSKVAIVGPTGAGKSTLINLLM 
RFYPISSGDILLDGQSIYDYTRVSLRQQFGMVLQETWLTQGTIHDNIAFGNPEASREQVIAAAKAANAD
FFIQQLPQGYDTKLENAGESLSVGQAQLLTIARVFLAIPKILILDEATSSIDTRTEVLVQDAFAKLMKG 
RTSFIIAHRLSTIQDADLILVLVDGDIVEYGNHQELMDRKGXYYQMQKAAAFSSE

SP109 nucleotide (SEQ ID NO:191)

ACGAAATGCAGGGCAGACAGATGCCTCGCAAATTGAAAAGGCGGCAGTTAGCCAAGGAGGAAAAGCAGT 
GAAAAAAACAGAAATTAGTAAAGACGCAGACTTGCACGAAATTTATCTAGCTGGAGGTTGTTTCTGGGG 
AGTGGAGGAATATTTCTCACGTGTTCCCGGGGTGACGGATGCCGTTTCAGGCTATGCAAATGGTAGAGG
AGAAACAACCAAGTACGAATTGATTAACCAAACAGGTCATGCAGAAACCGTCCATGTCACCTATGATGC 
CAAGCAAATTTCTCTCAAGGAAATCCTGCTTCACTATTTCCGCATTATCAATCCAACCAGCAAAAATAA 
ACAAGGAAATGATGTGGGGACCCAGTACCGTACTGGTGTTTATTACACAGATGACAAGGATTTGGAAGT 
GATTAACCAAGTCTTTGATGAGGTGGCTAAGAAATACGATCAACCTCTAGCAGTTGAAAAGGAAAACTT 
GAAGAATTTTGTGGTGGCTGAGGATTACCATCAAGACTATCTCAAGAAAAATCCAAATGGCTACTGCCA 
TATCAATGTTAATCAGGCGGCCTATCCTGTCATTGATGCCAGCAAATATCCAAAACCAAGTGATGAGGA 
ATTGAAAAAGACCCTGTCACCTGAGGAGTATGCAGTTACCCAGGAAAATCAAACAGAACGAGCTTTCTC 
AAACCGTTACTGGGATAAATTTGAATCCGGTATCTATGTGGATATAGCAACTGGGGAACCTCTCTTTTC 
ATCAAAAGACAAATTTGAGTCTGGTTGTGGCTGGCCTAGTTTTACCCAACCCATCAGTCCAGATGTTGT 
CACCTACAAGGAAGATAAGTCCTACAATATGACGCGTATGGAAGTGCGGAGCCGAGTAGGAGATTCTCA 
CCTTGGGCATGTCTTTACGGATGGTCCACAGGACAAGGGCGGCTTACGTTACTGTATCAATAGCCTCTC 
TATCCGCTTTATTCCCAAAGACCAAATGGAAGAAAAAGGCTACGCTTATTTACTAGATTATGTTGAT

SP109 amino acid (SEQ ID NO: 192) 

RNAGQTDASQIEKAAVSQGGKAVKKTEISKDADLHEIYLAGGCFWGVEEYFSRVPGVTDAVSGYANGRG
ETTKYELINQTGHAETVHVTYDAKQISLKEILLHYFRIINPTSKNKQGNDVGTQYRTGVYYTDDKDLEV
INQVFDEVAKKYDQPLAVEKENLKNFVVAEDYHQDYLKKNPNGYCHINVNQAAYPVIDASKYPKPSDEE
LKKTLSPEEYAVTQENQTERAFSNRYWDKFESGIYVDIATGEPLFSSKDKFESGCGWPSFTQPISPDVV
TYKEDKSYNMTRMEVRSRVGDSHLGHVFTDGPQDKGGLRYCINSLSIRFIPKDQMEEKGYAYLLDYVD

SP110 nucleotide (SEQ ID NO: 193) 

TGTATAGTTTTTAGCGCTTGTTCTTCTAATTCTGNTAAAAATGAAGAAAATACTTCTAAAGAGCATGCG 
CCTGATAAAATAGTTTTAGATCATGCTTTCGGTCAAACTATATTAGATAAAAAACCTGAAAGAGTTGCA 
ACTATTGCTTGGGGAAATCATGATGTAGCATTAGCTTTAGGAATAGTTCCTGTTGGATTTTCAAAAGCA 
AATTACGGTGTAAGTGCTGATAAAGGAGTTTTACCATGGACAGAAGAAAAAATCAAAGAACTAAATGGT 
AAAGCTAACCTATTTGACGATTTGGATGGACTTAACTTTGAAGCAATATCAAATTCTAAACCAGATGTT 
ATCTTAGCAGGTTATTCTGGTATAACTAAAGAAGATTATGACACTCTATCA 

SP110 amino acid (SEQ ID NO: 194) 

CIVFSACSSNSXKNEENTSKEHAPDKIVLDHAFGQTILDKKPERVATIAWGNHDVALALGIVPVGFSKA 
NYGVSADKGVLPWTEEKIKELNGKANLFDDLDGLNFEAISNSKPDVILAGYSGITKEDYDTLS 

SP111 nucleotide (SEQ ID NO:195) 

GTGTGTCGAGCATATTCTGAAGCAAACCTATCAAAATATAGAAATTATTTTAGTTGATGACGGTTCTAC 
GGATAATTCTGGGGAAATTTGTGATGCTTTTATGATGCAAGATAATCGTGTGCGAGTATTGCATCAAGA 
AAATAAGGGGGGGGCAGCACAAGCTAAAAATATGGGGATTAGTGTAGCTAAGGGAGAGTACATCACGAT 
TGTTGATTC^GATGATATCGTAAAAGAAAATATGATTGAAACTCTTTATCAGCAAGTCCAAGAAAAGGA 
TGCAGATGTTGTTATAGGGAATTACTATAATTATGACGAAAGTGACGGGAATTTTTATTTTTATGTAAC 
AGGGCAAGATTT m TGCGTCGAAGAATTAGCTATACAAGAAATTATGAACCGTCAAGCAGGAGATTGGAA 
ATTCAATAGCTCGGCCTTTATATTGCCGACATTTAAGTTGATTAAAAAAGAATTATTCAATGAAGTTCA 
CTTTTCAAATGGTCGCCGCTTTGATGATGAAGCAACTATGCATCGCTTTTATCTTTTAGCCTCTAAAAT
CGTCTTTATAAACGATAATCTCTATCTGTATAGAAGACGTTCAGGAAGCATCATGAGAACGGAATTTGA
TCTTTCCTGGGCAAGAGATATTGTTGAAGTGTTTTCTAAGAAAATATCGGATTGTGTCTTGGCTGGTTT
GGATGTCTCCGTTCTGCGTATTCGATTTGTCAATCTTTTAAAAGATTATAAGCAAACTTTAGAATACCA 
TCAATTAACAGATACTGAGGAATATAAAGATATTTGTTTCAGATTAAAGTTGTTTTTTGATGCAGAACA 
AAGAAATGGTAAAAGT 

SP111 amino acid (SEQ ID NO:196)

O/EHILKQTYQNIEIILVDDGSTDNSGEICDAFMMQDimVRVLHQEKKGGAAQAKNMGISVAKGEYITI 
VT)SDDIVKENMIETLYQQVQEKDADWIG^YN^ 

F^SSAFILPTFKLIKXELFNEVHFSNGRRFDDEATMHRFYLIASKIVFINDNLYLYRRRSGSIMRTEFD 
LSWARDIVEVFSKKISDCVLAGLDVSVLRIRFVNLLKDYKQTLEYHQLTDTEEYKDICFRLKLFFDAEQ
RNGKS 

SP0112 nucleotide (SEQ ID NO:197)

GTGTTTGGATAGCATTCAGAATCAGACGTATCAAAATTTTGAGTGTTTATTAATCAATGATGGCTCTCC 
AGATCATTCATCCAAAATATGTGAAGAATTTGTAGAGAAAGATTCTCGTTTCAAATATTTTGAGAAAGC 
AAACGGCGGTCTTTCATCAGCTCGTAACCTAGGTATTGAATGTTCGGGGGGGGCGTACATTACTTTTGT
AGACTCTGATGATTGGTTGGAACATGATGCTTTAGACCGATTATATGGTGCTTTGAAAAAGGAAAACGC
AGATATTAGTATCGGGCGTTATAATTCTTATGATGAAACACGCTATGTGTATATGACTTATGTTACGGA 
TCCAGATGATTCTCTAGAAGTGATAGAAGGTAAAGCAATTATGGATAGGGAAGGTGTCGAAGAAGTCAG 
AAATGGGAACTGGACTGTAGCTGTCTTGAAGTTATTCAAGAGAGAGTTACTACAAGATTTACCATTTCC 
TATAGGAAAAATTGCAGAGGATACTTACTGGACATGGAAGGTACTTCTAAGAGCTTCGAGGATAGTCTA 
TTTGAATCGTTGTGTTTACTGGTACCGTGTTGGTTTATCTGATACTTTATCGAATACATGGAGTGAAAA 
GCGTATGTATGATGAAATTGGGGCTAGGGAAGAAAAGATAGCTATTTTAGCAAGTTCAGACTATGACTT 
GACCAATCATATTTTGATTTATAAA.^TAGATTACAAAGAGTGATAGC^J^AATTAGAAGAAC^AAATAT 
GCAGTTCACAGAGATJTTACAGAAGAATGATGGAAAAATTGTCTTTACTTCCG

SP0112 amino acid (SEQ ID NO:198) 

CLDSIQNQTYQNFECLLINDGSPDHSSKICEEFVEKDSRFKYFEKANGGLSSARNLGIECSGGAYITFV
DSDDWLEHDALDRLYGALKKENADISIGRYNSYDETRYVYMTYVTDPDDSLEVIEGKAIMDREGVEEVR
NGNVm/AVLKLFKRELLQDLPFPIGKIAEDTYWTO^ 

RMYDEIGAREEKIAILASSDYDLTNHILIYKNRLQRVIAKLEEQNMQFTEIYRRMMEKLSLLP

SP113 nucleotide (SEQ ID NO:199)

GTGCCTAGATAGTATTATTACTCAAACATATAAAAATATTGAGATTGTTGTCGTTAATGATGGTTCTAC 
GGATGCTTCAGGTGAAATTTGTAAAGAATTTTCAGAAATGGATCACCGAATTCTCTATATAGAACAAGA
AAATGCTGGTCTTTCTGCCGCACGAAACACCGGTCTGAATAATATGTCCGGAAATTATGTGACCTTTGT 
GGACTCGGATGATTGGATTGAGCAAGATTATGTAGAAACTCTATATAAAAAAATAGTAGAGTATCAGGC 
TGATATTGCAGTTGGTAATTATTATTCTTTCAACGAAAGTGAAGGAATGTTCTACTTTCATATATTGGG
AGACTCCTATTATGAGAAAGTATATGATAATGTTTCTATCTTTGAGAACTTGTATGAAACTCAAGAAAT 
GAAGAGTTTTGCTTTGATATCTGCTTGGGGTAAACTCTATAAGGCAAGATTGTTTGAGCAGTTGCGCTT
TGACATAGGTAAATTAGGAGAAGATGGTTACCTCAATCAAAAGGTATATTTATTATCAGAAAAGGTAAT 
TTATTTAAATAAAAGTCTTTATGCTTATCGGATTAGAAAAGGTAGTTTATCAAGAGTTTGGACAGAAAA
GTGGATGCACGCTTTAGTTGATGCTATGTCTGAACGTATTACGCTACTAGCTAATATGGGTTATCCTCT
AGAGAAACACTTGGCAGTTTATCGTCAGATGTTGGAAGTCAGTCTCGCCAACGGTCAAGCTAGTGGTTT 
ATCTGACACAGCAACGTATAAAGAGTTTGAAATGAAACAAAGGCTTTTAAATCAGCTATCGAGACAAGA
GGAAAGTGAAAAGAAAGCCATTGTCCTCGCAGCAAACTATGGCTATGTAGACCAAGTTTTAACGACAAT
CAAGTCTATTTGTTATCATAATCGTTCGATTCGTTTTTATCTGATTCATAGCGATTTTCCAAATGAATG 
GATTAAGCAATTAAATAAGCGCTTAGAGAAGTTTGACTCAGAAATTATTAATTGTCGGGTAACTTCTGA
GCAAATTTCATGTTATAAATCGGATATTAGTTACACAGTCTTTTTACGCTATTTCATAGCTGATTTCGT 
GCAAGAAGACAAGGCCCTCTACTTGGACTGTGATCTAGTTGTAACGAAAAATCTGGATGACTTGTTTGC
TACAGACTTACAAGATTATCCTTTGGCTGCTGTTAGAGATTTTGGGGGCAGAGCTTATTTTGGTCAAGA 
AATCTTTAATGCCGGTGTTCTCTTGGTAAACAATGCTTTTTGGAAAAAAGAGAATATGACCC
AATTGATGTAACCAATGAATGGCATGATAAGGTGGATCAGGCAGATCAGAGCATCTTGAATATGCTTTT 
TGAACATAAATGGTTGGAATTGGACTTTGATTATAATCATATTGTCATTCATAAACAGTTTGCTGATTA 
TCAATTGCCTGAGGGTCAGGATTATCCTGCTATTATTCACTATCTTTCTCATCGGAAACCGTGGAAAGA 
TTTGGCGGCCCAAACCTATCGTGAAGTTTGGTGGTACTATCATGGGCTTGAATGGACAGAATTGGGACA
AAACCATCATTTACATCCATTACAAAGATCTCACATCTATCCAATAAAGGAACCTTTCACTTGTCTAAT
CTATACTGCCTCAGACCATATTGAACAAATTGAGACATTGGTTCAATCCTTGCCTGATATTCAGTTTAA
GATAGCAGCTAGAGTAATAGTTAGTGATCGATTGGCTCAGATGACAATTTATCCAAACGTGACTATATT
TAACGGAATTCACTATTTGGTAGATGTCGATAATGAATTGGTAGAAACCAGTCAAGTACTTTTAGATAT 
TAATCATGGCGAAAAGACAGAAGAAATTCTCGATCAATTTGCTAATCTTGGCAAGCCTATCTTATCCTT
TGAAAATACTAAAACCTATGAAGTAGGTCAGGAGGCATATGCTGTTGACCAAGTTCAAGCAATGATTGA 
AAAATTGAGAGAAATAAGCAAA 

SP113 amino acid (SEQ ID NO: 200) 

CLDSIITQTYKNIEIWVNDGSTDASGEICKEFSEMDHRI 

DSDDWIEQDYVETLYKKIVEYQADIAVGrTCYSFNESEGMFYFHILGDSre^ 

K5FALISAWGKLYKARLFEQLRFDIGKLGEDGYLNQK\TYLLSEKVIYLNK^ 

WMHALVDAMSERITLIANMGYPLEKK1AVYRQMLEVSLANGQASGLSDTATYKEFEMKQR 

ESEKKAIVLAANYGYVDQVLTTIKSICYHNRSIRFYLIHSDFPNEWIKQLNKRLEKFDSEIINCRVTSE

QISCYXSDISYTVFLRYFIADFVQEDKALYLDCDLWTKN^^ 

IFNAGVLLVNNAFWKKENMTQKLIDVTNEWHDK^ 

QLPEGQDYPAIIHYLSHRKPWKDLAAQTYREVWWYYHGLEWTELGQNHHLHPLQRSHIYPIKEPFTCLI
YTASDHIEQIETLVQSLPDIQFKIAARVIVSDRLAQMTIYPNVTIFNGIHYLVDVDNELVETSQVLLDI
NHGEKTEEILDQFANLGKPILSFENTKTYEVGQEAYAVDQVQAMIEKLREISK

SP114 nucleotide (SEQ ID NO:201) 

CATTCAGAAGCAGACCTATCAAAATCTGGAAATTATTCTTGTTGATGATGGTGCAACAGATGAAAGTGG 
TCGCTTGTGTGATTCAATCGCTGAACAAGATGACAGGGTGTCAGTGCTTCATAAAAAGAACGAAGGATT 
GTCGCAAGCACGAAATGATGGGATGAAGCAGGCTCACGGGGATTATCTGATTTTTATTGACTCAGATGA 
TTATATCCATCCAGAAATGATTCAGAGCTTATATGAGCAATTAGTTCAAGAAGATGCGGATGTTTCGAG 
CTGTGGTGTCATGAATGTCTATGCTAATGATGAAAGCCCACAGTCAGCCAATCAGGATGACTATTTTGT 
CTGTGATTCTCAAACATTTCTAAAGGAATACCTCATAGGTGAAAAAATACCTGGGACGATTTGCAATAA 
GCTAATCAAGAGACAGATTGCAACTGCCCTATCCTTTCCTAAGGGGTTGATTTACGAAGATGCCTATTA 
CCATTTTGATTTAATCAAGTTGGCCAAGAAGTATGTGGTTAATACTAAACCCTATTATTACTATTTCCA 
TAGAGGGGATAGTATTACGACCAAACCCTATGCAGAGAAGGATTTAGCCTATATTGATATCTACCAAAA 
GTTTTATAATGAAGTTGTGAAAAACTATCCTGACTTGAAAGAGGTCGCTTTTTTCAGATTGGCCTATGC 
CCACTTCTTTATTCTGGATAAGATGTTGCTAGATGATCAGTATAAACAGTTTGAAGCCTATTCTCAGAT
TCATCGTTTTTTAAAAGGCCATGCCTTTGCTATTTCTAGGAATCCAATTTTCCGTAAGGGGAGAAGAAT 
TAGTGCTTTGGCCCTATTCATAAATATTTCCTTATATCGATTCTTATTACTGAAAAATATTGAAAAATC 

TAAAAAATTACAT

SP114 amino acid (SEQ ID NO:202) 

IQKQTYQNLEIILVDDGATDESGRLCDSIAEQDDRVSVLHKKNEGLSQARNDGMKQAHGDYLIFIDSDD
YIHPEMIQSLYEQLVQEDADVSSCGVMNVYANDESPQSANQDDYFVCDSQTFLKEYLIGEKIPGTICNK
LIKRQIATALSFPKGLIYEDAYYHFDLIKLAKKYVVNTKPYYYYFHRGDSITTKPYAEKDLAYIDIYQK 
FYNEVVKNYPDLKEVAFFRLAYAHFFILDKMLLDDQYKQFEAYSQIHRFLKGHAFAISRNPIFRKGRRI 
SALALFINISLYRFLLLKNIEKSKKLH

SP115 nucleotide (SEQ ID NO:203)

TAAGGCTGATAATCGTGTTCAAATGAGAACGACGATTAATAATGAATCGCCATTGTTGCTTTCTCCGTT 
GTATGGCAATGATAATGGTAACGGATTATGGTGGGGGAACACATTGAAGGGAGCATGGGAAGCTATTCC 
TGAAGATGTAAAGCCATATGCAGCGATTGAACTTCATCCTGCAAAAGTCTGTAAACCAACAAGTTGTAT 
TCCACGAGATACGAAAGAATTGAGAGAATGGTATGTCAAGATGTTGGAGGAAGCTCAAAGTCTAAACAT 
TCCAGTTTTCTTGGTTATTATGTCGGCTGGAGAGCGTAATACAGTTCCTCCAGAGTGGTTAGATGAACA 
ATTCCAAAAGTATAGTGTGTTAAAAGGTGTTTTAAATATTGAGAATTATTGGATTTACAATAACCAGTT
AGCTCCGCATAGTGCTAAATATTTGGAAGTTTGTGCCAAATATGGAGCGCATTTTATCTGGCATGATCA 
TGAAAAATGGTTCTGGGAAACTATTATGAATGATCCGACATTCTTTGAAGCGAGTCAAAAATATCATAA 
AAATTTGGTGTTGGCAACTAAAAATACGCCAATAAGAGATGATGCGGGTACAGATTCTATCGTTAGTGG 
ATTTTGGTTGAGTGGCTTATGTGATAACTGGGGCTCATCAACAGATACATGGAAATGGTGGGAAAAACA 
TTATACAAACACATTTGAAACTGGAAGAGCTAGGGATATGAGATCCTATGCATCGGAACCAGAATCAAT 
GATTGCTATGGAAATGATGAATGTATATACTGGGGGAGGCACAGTTTATAATTTCGAATGTGCCGCGTA 

TACATTTATGACAAATGATGTACCAACTCCAGCATTTACTAAAGGTATTATTCCTTTCTTTAGACATGC
TATACAAAATCCAGCTCCAAGTAAGGAAGAAGTTGTAAATAGAACAAAAGCTGTATTTTGGAATGGAGA 
AGGTAGGATTAGTTCATTAAACGGATTTTATCAAGGACTTTATTCGAATGATGAAACAATGCCTTTATA
TAATAATGGGAGATATCATATTCTTCCTGTAATACATGAGAAAATTGATAAGGAAAAGATTTCATCTAT
ATTCCCTAATGCAAAAATTTTGACTAAAAATAGTGAGGAATTGTCTAGTAAAGTCAACTATTTAAACTC

GCTTTATCCAAAACTTTATGAAGGAGATGGGTATGCTCAGCGTGTAGGTAATTCCTGGTATATTTATAA 

TAGTAATGCTAATATCAATAAAAATCAGCAAGTAATGTTGCCTATGTATACTAATAATACAAAGTCGTT 

ATCGTTAGATTTGACGCCACATACTTACGCTGTTGTTAAAGAAAATCCAAATAATTTACATATTTT 

GAATAATTACAGGACAGATAAGACAGCTATGTGGGCATTATCAGGAAATTTTGATGCATCAAAAAGTTG 

GAAGAAAGAAGAATTAGAGTTAGCGAACTGGATAAGCAAAAATTATTCCATCAATCCTGTAGATAATGA 

CTTTAGGACAACAACACTTACATTAAAAGGGCATACTGGTCATAAACCTCAGATAAATATAAGTGGCGA 

TAAAAATCATTATACTTATACAGAAAATTGGGATGAGAATACCCATGTTTATACCATTACGGTTAATCA 

TAATGGAATGGTAGAGATGTCTATAAATACTGAGGGGACAGGTCCAGTCTCTTTCCCAACACCAGATAA 

ATTTAATGATGGTAATTTGAATATAGCATATGCAAAACCAACAACACAAAGTTCTGTAGATTACAATGG

AGACCCTAATAGAGCTGTGGATGGTAACAGAAATGGTAATTTTAACTCTGGTTCGGTAACACACACTAG 

GGCAGATAATCCCTCTTGGTGGGAAGTCGATTTGAAAAAAATGGATAAAGTTGGGCTTGTTAAAATTTA 

TAATCGCACAGATGCTGAGACTCAACGTCTATCTAATTTT 

SP115 amino acid (SEQ ID NO: 204) 

KADOTVQMRTTINNESPLLLS?LYGNDNGNGLWWGNTLXGAWEAI?EDVKPYAAI£LHPAKVCKPTSCI 
PRDTKELREWYVKMLEEAQSLNIPVFLVIMSAGERNTVPPEWL^ 
APHSAKYLE^CAKYGAHFIWHDHEXWFWETIMNDPTFFE^SQ 
r^/tfLSGLCDNWGSSTDTWKVWEKKYTNTFZTGRARDM 

TFMTNDVPTPAFTKGIIPFFRHAIQNPAPSKEEVVNRTKAVFWNGEGRISSLNGFYQGLYSNDETMPLY

nngryhilpvikik:dkekissif?nax::tkn^^ 
3naninknqqvml?htytnntkslsldlt?htya\a^enp^ 
kkeelelawisknysinpmvdfrttt^tlkghtghkpqinisgdkntiytytenwdent 
NGMVEMSINTEGTGPVSFPTPDKFNDGNLNIAYAKPTTQSSVDYNGDPNRAVDGNRNGNFNSGSVTHTR
ADNPSWWEVDLKKMDKVGLVKIYNRTDAETQRLSNF

SP117 nucleotide (SEQ ID NO:205) 

CTGTGGCAATCAGTCAGCTGCTTCCAAACAGTCAGCTTCAGGAACGATTGAGGTGATTTCACGAGAAAA 
TGGCTCTGGGACACGGGGTGCCTTCACAGAAATCACAGGGATTCTCAAAAAAGACGGTGATAAAAAAAT 
TGACAACACTGCCAAAACAGCTGTGATTCAAAATAGTACAGAAGGTGTTCTCTCAGCAGTTCAAGGGAA 
TGCTAATGCTATCGGCTACATCTCCTTGGGATCTTTAACGAAATCTGTCAAGGCTTTAGAGATTGATGG 
TGTCAAGGCTAGTCGAGACACAGTTTTAGATGGTGAATACCCTCTTCAACGTCCCTTCAACATTGTTTG 
GTCTTCTAATCTTTCCAAGCTAGGTCAAGATTTTATCAGCTTTATCCACTCCAAACAAGGTCAACAAGT 
GGTCACAGATAATAAATTTATTGAAGCTAAAACCGAAACCACGGAATATACAAGCCAACACTTATCAGG 
CAAGTTGTCTGTTGTAGGTTCCACTTCAGTATCTTCTTTAATGGAAAAATTAGCAGAAGCTTATAAAAA 
AGAAAATCCAGAAGTTACGATTGATATTACCTCTAATGGGTCTTCAGCAGGTATTACCGCTGTTAAGGA
GAAAACCGCTGATATTGGTATGGTTTCTAGGGAATTAACTCCTGAAGAAGGTAAGAGTCTCACCCATGA 
TGCTATTGCTTTAGACGGTATTGCTGTTGTGGTCAATAATGACAATAAGGCAAGCCAAGTCAGTATGGC 
TGAACTTGCAGACGTTTTTAGTGGCAAATTAACCACCTGGGACAAGATTAAA

SP117 amino acid (SEQ ID NO:206) 

CGNQSAASKQSASGTIEVISRENGSGTRGAFTEITGILKKDGDKKIDNTAKTAVIQNSTEGVLSAVQGN
ANAIGYISLGSLTKSVKALEIDGVKASRDTVLDGEYPLQRPFNIVWSSNLSKLGQDFISFIHSKQGQQV
VTDNKFIEAKTETTEYTSQHLSGKLSVVGSTSVSSLMEKLAEAYKKENPEVTIDITSNGSSAGITAVKE
KTADIGMVSRELTPEEGKSLTHDAIALDGIAVVVNNDNKASQVSMAELADVFSGKLTTWDKIK

SP118 nucleotide (SEQ ID NO:207) 

TTGTCAACAACAACATGCTACTTCTGAGGGGACGAATCAAAGGCAAAGCAGTTCAGCGAAAGTTCCATG
GAAAGCTTCATACACCAACCTAAACAACCAGGTAAGTACAGAAGAGGTCAAATCTCTCTTATCAGCTCA 
CTTGGATCCAAATAGTGTTGATGCATTTTTTAATCTCGTTAATGACTATAATACCATTGTCGGCTCAAC
TGGCTTATCAGGAGATTTCACTTCCTTTACTCACACCGAATACGATGTTGAGAAAATCAGTCATCTCTG 
GAATCAAAAGAAGGGCGATTTTGTTGGGACCAACTGCCGTATCAATAGTTATTGTCTTTTGAAAAATTC 
AGTCACCATTCCAAAGCTTGAAAAGAATGACCAGTTGCTTTTCCTAGATAATGATGCGATTGATAAAGG
AAAGGTCTTTGATTCACAAGATAAGGAAGAGTTTGATATTCTATTTTCGAGAGTTCCAACTGAGTCAAC 
TACAGATGTCAAGGTTCACGCTGAAAAGATGGAAGCATTCTTCTCACAATTTCAATTCAATGAAAAAGC 
TCGAATGCTGTCTGTAGTCTTGCACGACAATTTGGATGGCGAGTATCTGTTTGTAGGCCACGTTGGGGT 
CTTAGTACCTGCTGATGACGGTTTCTTATTTGTAGAGAAATTGACTTTCGAAGAGCCCTACCAAGCGAT
TAAATTTGCTAGTAAGGAAGATTGCTACAAGTATTTGGGCACCAAGTATGCGGATTATACAGGCGAGGG
ACTGGCTAAGCCTTTTATCATGGATAATGATAAGTGGGTTAAACTT

SP118 amino acid (SEQ ID NO:208)

CQQQHATSEGTNQRQSSSAKVPWKASYTNLNNQVSTEEVKSLLSAHLDPNSVDAFFNLVNDYNTIVGST
GLSGDFTSFTHTEYDVEKISHLWNQKKGDFVGTNCRINSYCLLKNSVTIPKLEKNDQLLFLDNDAIDKG
KVFDSQDKEEFDILFSRVPTESTTDVKVHAEKMEAFFSQFQFNEKARMLSVV^ 
LVPADDGFLFVEKLTFEEPYQAIKFASKEDCYKYLGTKYADYTGEGLAKPFIMDNDKWVKL

SP119 nucleotide (SEQ ID NO: 209) 

TTGTTCAGGCAAGTCCGTGACTAGTGAACACCAAACGAAAGATGAAATGAAGACGGAGCAGACAGCTAG
TAAAACAAGCGCAGCTAAAGGGAAAGAGGTGGCTGATTTTGAATTGATGGGAGTAGATGGCAAGACCTA 
CCGTTTATCTGATTACAAGGGCAAGAAAGTCTATCTCAAATTCTGGGCTTCTTGGTGTTCCATCTGTCT 
GGCTAGTCTTCCAGATACGGATGAGATTGCTAAAGAAGCTGGTGATGACTATGTGGTCTTGACAGTAGT 
GTCACCAGGACATAAGGGAGAGCAATCTGAAGCGGACTTTAAGAATTGGTATAAGGGATTGGATTATAA 
AAATCTCCCAGTCCTAGTTGACCCATCAGGCAAACTTTTGGAAACTTATGGTGTCCGTTCTTACCCAAC
CCAAGCCTTTATAGACAAAGAAGGCAAGCTGGTCAAAACACATCCAGGATTCATGGAAAAAGATGCAAT 

TTTGCAAACTTTGAAGGAATTAGCC

SP119 amino acid (SEQ ID NO: 210) 

CSGKSVTSEHQTKDEMKTEQTASKTSAAKGKEVADFELMGVDGKTYRLSDYKGKKVYLKFWASWCSICL
ASLPDTDEIAKEAGDDYVVLTVVSPGHKGEQSEADFKNWYKGLDYKNLPVLVDPSGKLLETYGVRSYPT
QAFIDKEGKLVKTHPGFMEKDAILQTLKELA 

SP120 nucleotide (SEQ ID NO:211) 

CTCGCAAATTGAAAAGGCGGCAGTTAGCCAAGGAGGAAAAGCAGTGAAAAAAACAGAAATTAGTAAAGA 
CGCAGACTTGCACGAAATTTATCTAGCTGGAGGTTGTTTCTGGGGAGTGGAGGAATATTTCTCACGTGT 
TCCCGGGGTGACGGATGCCGTTTCAGGCTATGCAAATGGTAGAGGAGAAACAACCAAGTACGAATTGAT
TAACCAAACAGGTCATGCAGAAACCGTCCATGTCACCTATGATGCCAAGCAAATTTCTCTCAAGGAAAT 
CCTGCTTCACTATTTCCGCATTATCAATCCAACCAGCAAAAATAAACAAGGAAATGATGTGGGGACCCA 
GTACCGTACTGGTGTTTATTACACAGATGACAAGGATTTGGAAGTGATTAACCAAGTCTTTGATGAGGT
GGCTAAGAAATACGATCAACCTCTAGCAGTTGAAAAGGAAAACTTGAAGAATTTTGTGGTGGCTGAGGA
TTACCATCAAGACTATCTCAAGAAAAATCCAAATGGCTACTGCCATATCAATGTTAATCAGGCGGCCTA 
TCCTGTCATTGATGCCAGCAAATATCCAAAACCAAGTGATGAGGAATTGAAAAAGACCCTGTCACCTGA
GGAGTATGCAGTTACCCAGGAAAATCAAACAGAACGAGCTTTCTCAAACCGTTACTGGGATAAATTTGA 
ATCCGGTATCTATGTGGATATAGCAACTGGGGAACCTCTCTTTTCATCAAAAGACAAATTTGAGTCTGG 
TTGTGGCTGGCCTAGTTTTACCCAACCCATCAGTCCAGATGTTGTCACCTACAAGGAAGATAAGTCCTA
CAATATGACGCGTATGGAAGTGCGGAGCCGAGTAGGAGATTCTCACCTTGGGCATGTCTTTACGGATGG 
TCCACAGGACAAGGGCGGCTTACGTTACTGTATCAATAGCCTCTCTATCCGCTTTATTCCCAAAGACCA 

AATGGAAGAAAAAGGTACGCTTATTTAC 



SP120 amino acid (SEQ ID NO:212) 

SQIEKAAVSQGGKAVKKTEISKDADLHEIYLAGGCFWGVEEYFSRVPGVTDAVSGYANGRGETTKYELI
NQTGHAETVHVTYDAKQISLKEILLHYFRIINPTSKNKQGNDVGTQYRTGVYYTDDKDLEVINQVFDEV
AKKYDQPLAVEKENLKNFVVAEDYHQDYLKKNPNGYCHINVNQAAYPVIDASKYPKPSDEELKKTLSPE
EYAVTQENQTERAFSNRYWDKFESGIYVDIATGEPLFSSKDKFESGCGWPSFTQPISPDVVTYKEDKSY
NMTRMEVRSRVGDSHLGHVFTDGPQDKGGLRYCINSLSIRFIPKDQMEEKGTLIY

SP121 nucleotide (SEQ ID NO:213) 

TTGTCAGTCAGGTTCTAATGGTTCTCAGTCTGCTGTGGATGCTATCAAACAAAAAGGGAAATTAGTTGT
GGCAACCAGTCCTGACTATGCACCCTTTGAATTTCAATCATTGGTTGATGGAAAGAACCAGGTAGTCGG
TGCAGACATCGACATGGCTCAGGCTATCGCTGATGAACTTGGGGTTAAGTTGGAAATCTCAAGCATGAG
TTTTGACAATGTTTTGACCAGTCTTCAAACTGGTAAGGCTGACCTAGCAGTTGCAGGAATTAGTGCTAC
TGACGAGAGAAAAGAAGTCTTTGATTTTTCAATCCCATACTATGAAAACAAGATTAGTTTCTTGGTTCG
TAAGGCTGATGTGGAAAAATACAAGGATTTAACTAGCCTAGAAAGTGCTAATATTGCAGCCCAAAAAGG 
GACTGTTCCAGAATCAATGGTCAAGGAACAATTGCCAAAAGTTCAATTAACTTCCCTAACTAATATGGG
TGAAGCAGTCAATGAATTGCAGGCTGGAAAAATAGATGCTGTTCATATGGATGAGCCTGTTGCACTTAG
TTATGCTGCTAAAAACGCTGGCTTAGCTGTCGCAACTGTCAGCTTGAAGATGAAGGACGGCGACGCCAA
TGCC 

SP121 amino acid (SEQ ID NO:214)

CQSGSNGSQSAVDAIKQKGKLVVATSPDYAPFEFQSLVDGKNQVVGADIDMAQAIADELGVKLEISSMS
FDNVLTSLQTGKADLAVAGISATDERKEVFDFSIPYYENKISFLVRKADVEKYKDLTSLESANIAAQKG
TVPESMVKEQLPKVQLTSLTNMGEAVNELQAGKIDAVHMDEPVALSYA
A 

SP122 nucleotide (SEQ ID NO:215) 

GGAAACTTCACAGGATTTTAAAGAGAAGAAAACAGCAGTCATTAAGGAAAAAGAAGTTGTTAGTAAAAA 

TCCTGTGATAGACAATAACACTAGCAATGAAGAAGCAAAAATCAAAGAAGAAAATTCCAATAAATCCCA 

AGGAGATTATACGGACTCATTTGTGAATAAAAACACAGAAAATCCCAAAAAAGAAGATAAAGTTGTCTA 

TATTGCTGAATTTAAAGATAAAGAATCTGGAGAAAAAGCAATCAAGGAACTATCCAGTCTTAAGAATAC 

AAAAGTTTTATATACTTATGATAGAATTTTTAACGGTAGTGCCATAGAAACAACTCCAGATAACTTGGA 

CAAAATTAAACAAATAGAAGGTATTTCATCGGTTGAAAGGGCACAAAAAGTCCAACCCATGATGAATCA 

TGCCAGAAAGGAAATTGGAGTTGAGGAAGCTATTGATTACCTAAAGTCTATCAATGCTCCGTTTGGGAA

AAATTTTGATGGTAGAGGTATGGTCATTTCAAATATCGATACTGGAACAGATTATAGACATAAGGCTAT

GAGAATCGATGATGATGCCAAAGCCTCAATGAGATTTAAAAAAGAAGACTTAAAAGGCACTGATAAAAA

TTATTGGTTGAGTGATAAAATCCCTCATGCGTTCAATTATTATAATGGTGGCAAAATCACTGTAGAAAA 

ATATGATGATGGAAGGGATTATTTTGACCCACATGGGATGCATATTGCAGGGATTCTTGCTGGAAATGA 

TACTGAACAAGACATCAAAAACTTTAACGGCATAGATGGAATTGCACCTAATGCACAAATTTTCTCTTA

CAAAATGTATTCTGACGCAGGATCTGGGTTTGCGGGTGATGAAACAATGTTTCATGCTATTGAAGATTC

TATCAAACACAACGTTGATGTTGTTTCGGTATCATCTGGTTTTACAGGAACAGGTCTTGTAGGTGAGAA 

ATATTGGCAAGCTAT^CGGGCATTAAGAAAAGCAGGCATTCCAATGGTTGTCGCTACGGGTAACTATGC 

GACTTCTGCTTCAAGTTCTTCATGGGATTTAGTAGCAAATAATCATCTGAAAATGACCGACACTGGAAA

TGTAACACGAACTGCAGCACATGAAGATGCGATAGCGGTCGCTTCTGCTAAAAATCAAACAGTTGAGTT

TGATAAAGTTAACATAGGTGGAGAAAGTTTTAAATACAGAAATATAGGGGCCTTTTTCGATAAGAGTAA 

AATCACAACAAATGAAGATGGAACAAAAGCTCCTAGTAAATTAAAATTTGTATATATAGGCAAGGGGCA 

AGACCAAGATTTGATAGGTTTGGATCTTAGGGGCAAAATTGCAGTAATGGATAGAATTTATACAAAGGA 

TTTAAAAAATGCTTTTAAAAAAGCTATX^ATAAGGGTGCACGCGCC ATT ATGGTTGT AAAT ACTG 

TTACTACAATAGAGATAATTGGACAGAGCTTCCAGCTATGGGATATGAAGCGGATGAAGGTACTAAAAG 

TCAAGTGTTTTCAATTTCAGGAGATGATGGTGTAAAGCTATGGAACATGATTAATCCTGATAAAAAAAC

TGAAGTCAAAAGAAATAATAAAGAAGATTTTAAAGATAAATTGGAGCAATACTATCCAATTGATATGGA

AAGTTTTAATTCCAACAAACCGAATGTAGGTGACGAAAAAGAGATTGACTTTAAGTTTGCACCTGACAC

AGACAAAGAACTCTATAAAGAAGATATCATCGTTCCAGCAGGATCTACATCTTGGGGGCCAAGAATAGA 

TTTACTTTTAAAACCCGATGTTTCAGCACCTGGTAAAAATATTAAATCCACGCTTAATGTTATTAATGG 

CAAATCAACTTATGGCTATATGTCAGGAACTAGTATGGCGACTCCAATCGTGGCAGCTTCTACTGTTTT 

GATTAGACCGAAATTAAAGGAAATGCTTGAAAGACCTGTATTGAAAAATCTTAAGGGAGATGACAAAAT

AGATCTTACAAGTCTTACAAAAATTGCCCTACAAAATACTGCGCGACCTATGATGGATGCAACTTCTTG

GAAAGAAAAAAGTCAATACTTTGCATCACCTAGACAACAGGGAGCAGGCCTAATTAATGTGGCCAATGC

TTTGAGAAATGAAGTTGTAGCAACTTTCAAAAACACTGATTCTAAAGGTTTGGTAAACTCATATGGTTC 

CATTTCTCTTAAAGAAATAAAAGGTGATAAAAAATACTTTACAATCAAGCTTCACAATACATCAAACAG

ACCTTTGACTTTTAAAGTTTCAGCATCAGCGATAACTACAGATTCTCTAACTGACAGATTAAAACTTGA 

TGAAACATATAAAGATGAAAAATCTCCAGATGGTAAGCAAATTGTTCCAGAAATTCACCCAGAAAAAGT

CAAAGGAGCAAATATCACATTTGAGCATGATACTTTCACTATAGGCGCAAATTCTAGCTTTGATTTGAA

TGCGGTTATAAATGTTGGAGAGGCCAAAAACAAAAATAAATTTGTAGAATCATTTATTCATTTTGAGTC 

AGTGGAAGCGATGGAAGCTCTAAACTCCAGCGGGAAGAAAATAAACTTCCAACCTTCTTTGTCGATGCC 

TCTAATGGGATTTGCTGGGAATTGGAACCACGAACCAATCCTTGATAAATGGGCTTGGGAAGAAGGGTC 

AAGATCAAAAACACTGGGAGGTTATGATGATGATGGTAAACCGAAAATTCCAGGAACCTTAAATAAGGG 

AATTGGTGGAGAACATGGTATAGATAAATTTAATCCAGCAGGAGTTATACAAAATAGAAAAGATAAAAA

TACAACATCCCTGGATCAAAATCCAGAATTATTTGCTTTCAATAACGAAGGGATCAACGCTCCATCATC 

AAGTGGTTCTAAGATTGCTAACATTTATCCTTTAGATTCAAATGGAAATCCTCAAGATGCTCAACTTGA 

AAGAGGATTAACACCTTCTCCACTTGTATTAAGAAGTGCAGAAGAAGGATTGATT 

SP122 amino acid (SEQ ID NO:216)

ETSQDFKEKKTAVIKEKEWSKNPVIDNNTSNEEAKIKEENSNKSQGDYTDSFVM 
IAEFKDKESGEKAIKELSSLKNTKVLYTYDRIFNGSAIETTPDNLDKIKQIEGISSVERAQKVQPMMNH
ARKEIGVEEAIDYLKSINAPFGKNFDGRGMVISNIDTGTDYRHKAMRIDDDAKASMRFKKEDLKGTDKN
YWLSDKIPHAFNYYNGGKIWEKYDDGRDYFDPHGMHIAGIIAGNOTEQDIKNFNGI 
KMYSDAGSGFAGDETMFHAIEDSIKHNVDVVSVSSGFTGTGLVGEKYWQAIRALRKAGIPMVVATGNYA
TSASSSSWDLVA^LKMTOTGNVTRTAAHEDAIAVAS^^ 

ITTOEIX3TKAPSKLKF\AriGKGQDODLIGLDLRGKIAVMDRIYTKDLKNAFKKAMDKGARAIW 
YYNRDNWTELPAMGYEADEGTKSQVFSISGDDGVKLWNMINPDKKTEVKRNNKEDFKDKLEQYYPIDME
SFNSNKPNVGDEKEIDFKFAPDTDKELYKEDIIVPAGSTSWGPRIDLLLKPDVSAPGKNIKSTLNVING 
KSTYGYMSGTSMATPIVAASTVLIRPKLKEMLERPVLKNLKGDDKIDLTSLTKIALQNTARPMMDATSW 
KEKSQYFASPRQQGAGLINVANALRNEVVATFKNTDSKGLVNSYGSISLKEIKGDKKYFTIKLHNTSNR
PLTFKVSASAITTDSLTDRLKLDETYKDEKSPDGKQIVPEIHPEKVKGANITFEHDTFTIGANSSFDLN 

AVINVGEAKNKNKFTOSFIHFESVEAMEJUNSSGKK^ 

RSKTLGGYDDDGKPKIPGTLNKGIGGEHGIDKFNPAGVIQNRKDKNTTSLDQNPELFAFNNEGINAPSS 
SGSKIANIYPLDSNGNPQDAQLERGLTPSPLVLRSAEEGLI 

SP123 nucleotide (SEQ ID NO:217) 

TGTGGTCGAAGTTGAGACTCCTCAATCAATAACAAATCAGGAGCAAGCTAGGACAGAAAACCAAGTAGT
AGAGACAGAGGAAGCTCCAAAAGAAGAAGCACCTAAAACAGAAGAAAGTCCAAAGGAAGAACCAAAATC 
GGAGGTAAAACCTACTGACGACACCCTTCCTAAAGTAGAAGAGGGGAAAGAAGATTCAGCAGAACCAGC 
TCCAGTTGAAGAAGTAGGTGGAGAAGTTGAGTCAAAACCAGAGGAAAAAGTAGCAGTTAAGCCAGAAAG
TCAACCATCAGACAAACCAGCTGAGGAATCAAAAGTTGAACAAGCAGGTGAACCAGTCGCGCCAAGAGA 
AGACGAAAAGGCACCAGTCGAGCCAGAAAAGCAACCAGAAGCTCCTGAAGAAGAGAAGGCTGTAGAGGA
AACACCGAAACAAGAAGAGTCAACTCCAGATACCAAGGCTGAAGAAACTGTAGAACCAAAAGAGGAGAC
TGTTAATCAATCTATTGAACAACCAAAAGTTGAAACGCCTGCTGTAGAAAAACAAACAGAACCAACAGA 
GGAACCAAAAGTTGAACAAGCAGGTGAACCAGTCGCGCCAAGAGAAGACGAACAGGCACCAACGGCACC 
AGTTGAGCCAGAAAAGCAACCAGAAGTTCCTGAAGAAGAGAAGGCTGTAGAGGAAACACCGAAACCAGA
AGATAAAATAAAGGGTATTGGTACTAAAGAACCAGTTGATAAAAGTGAGTTAAATAATCAAATTGATAA
AGCTAGTTCAGTTTCTCCTACTGATTATTCTACAGCAAGTTACAATGCTCTTGGACCTGTTTTAGAAAC 
TGCAAAAGGTGTCTATGCTTCAGAGCCTGTAAAACAGCCTGAGGTAAATAGCGAGACAAATAAACTTAA 
AACGGCTATTGACGCTCTAAACGTTGATAAAACTGAATTAAACAATACGATTGCAGATGCAAAAACAAA
GGTAAAAGAACATTACAGTGATAGAAGTTGGCAAAACCTCCAAACTGAAGTTACAAAGGCTGAAAAAGT
TGCAGCTAATACAGATGCTAAACAAAGTGAAGTTAACGAAGCTGTTGAAAAATTAACTGCAACTATTGA 
AAAATTGGTTGAATTATCTGAAAAGCCAATATTAACATTGACTAGTACCGATAAGAAAATATTGGAACG
TGAAGCTGTTGCTAAGTATACTCTAGAAAATCAAAACAAAACAAAAATCAAATCAATCACAGCTGAATT
GAAAAAAGGAGAAGAAGTTATTAATACTGTAGTCCTTACAGATGACAAGGTAACAACAGAAACTATAAG
CGCTGCATTTAAGAACCTAGAGTACTACAAAGAATACACCCTATCTACAACTATGATTTACGACAGAGG
TAACGGTGAAGAAACTGAAACTCTAGAAAATCAAAATATTCAATTAGATCTTAAAAAAGTTGAGCTTAA 
AAATATTAAACGTACAGATTTAATCAAATACGAAAATGGAAAAGAAACTAATGAATCACTGATAACAAC 
TATTCCTGATGATAAGAGCAATTATTATTTAAAAATAACTTCAAATAATCAGAAAACTACATTACTAGC
TGTTAAAAATATAGAAGAAACTACGGTTAACGGAACACCTGTATATAAAGTTACAGCAATCGCAGACAA 
TTTAGTCTCTAGAACTGCTGATAATAAATTTGAAGAAGAA 

SP123 amino acid (SEQ ID NO:218) 

VVEVETPQSITNQEQARTENQVVETEEAPKEEAPKTEESPKEEPKSEVKPTDDTLPKVEEGKEDSAEPA
PVEEVGGEVESKPEEKVAVKPESQPSDKPAEESKVEQAGEPVAPREDEKAPVEPEKQPEAPEEEKAVEE 
TPKQEESTPDTKAEETVEPKEETVNQSIEQPKVETPAVEKQTEPTEEPKVEQAGEPVAPREDEQAPTAP 
VEPEKQPEVPEEEKAVEETPKPEDKIKGIGTKEPVDKSELNNQIDKASSVSPTDYSTASYNALGPVLET
AKGVYASEPVKQPEVNSETNKLKTAIDALNVDKTELNNTIADAKTKVKEHYSDRSWQNLQTEVTKAEKV
AANTDAKQSEVNEAVEKLTATIEKLVELSEKPILTLTSTDKKILEREAVAKYTLENQNKTKIKSITAEL
KKGEEVINTVVLTDDKVTTETISAAFKNLEYYKEYTLSTTMIYDRGNGEETETLENQNIQLDLKKVELK 
NIKRTDLIKYENGKETNESLITTIPDDKSNYTLKITSNNQKTTLLAVKNIEETTVNGTPVYKVTAIADN 

LVSRTADNKFEEE 

SP124 nucleotide (SEQ ID NO:219)

AACACCTGTATATAAAGTTACAGCAATCGCAGACAATTTAGTCTCTAGAACTGCTGATAATAAATTTGA
AGAAGAATACGTTCACTATATTGAAAAACCTAAAGTCCACGAAGATAATGTATATTATAATTTCAAAGA
ATTAGTGGAAGGTATTCAAAACGATCCTTCAAAAGAATATCGTCTGGGACAATCAATGAGCGCTAGAAA
TGTTGTTCCTAATGGAAAATCATATATCACTAAAGAATTCACAGGAAAACTTTTAAGTTCTGAAGGAAA 
ACAATTTGCTATTACTGAATTGGAACATCCATTATTTAATGTGATAACAAACGCAACGATAAATAATGT 
GAATTTTGAAAATGTAGAGATAGAACGTTCTGGTCAAGATAATATTGCATCATTAGCCAATACTATGAA
AGGTTCTTCAGTTATTACAAATGTCAAAATTACAGGCACACTTTCAGGTCGTAATAATGTTGCTGGATT 
TGTAAATAATATGAATGATGGAACTCGTATTGAAAATGTTGCTTTCTTTGGCAAACTACACTCTACAAG 
TGGAAATGGCTCTCATACAGGGGGAATTGCAGGTACAAACTATAGAGGAATTGTTAGAAAAGCATATGT 
TGATGCTACTATTACAGGAAACAAAACACGCGCCAGCTTGTTAGTTCCTAAAGTAGATTATGGATTAAC 
TCTAGACCATCTTATTGGTACAAAAGCTCTCCTAACTGAGTCGGTTGTAAAAGGTAAAATAGATGTTTC 
AAATCCAGTAGAAGTTGGAGCAATAGCAAGTAAGACTTGGCCTGTAGGTACGGTAAGTAATTCTGTCAG 
CTATGCTAAGATTATCCGTGGAGAGGAGTTATTCGGCTCTAACGACGTTGATGATTCTGATTATGCTAG 
TGCTCATATAAAAGATTTATATGCGGTAGAGGGATATTCGTCAGGTAATAGATCATTTAGGAAATCTAA 
AACATTTACTAAATTAACTAAAGAACAAGCTGATGCTAAAGTTACTACTTTCAATATTACTGCTGATAA 
ATTAGAAAGTGATCTATCTCCTCTTGCAAAACTTAATGAAGAAAAAGCCTATTCTAGTATTCAAGATTA 
TAACGCTGAATATAACCAAGCCTATAAAAATCTTGAAAAATTAATACCATTCTACAATAAAGATTATAT 
TGTATATCAAGGTAATAAATTAAATAAAGAACACCATCTAAATACTAAAGAAGTTCTTTCTGTTACCGC 
GATGAACAACAATGAGTTTATCACAAACCTAGATGAAGCTAATAAAATTATTGTTCACTATGCGGACGG 
TACAAAAGATTACTTTAACTTGTCTTCTAGCAGTGAAGGTTTAAGTAATGTAAAAGAATATACTATAAC 
TGACTTAGGAATTAAATATACACCTAATATCGTTCAAAAAGATAACACTACTCTTGTTAATGATATAAA
ATCTATTTTAGAATCAGTAGAGCTTCAGTCTCAAACGATGTATCAGCATCTAAATCGATTAGGTGACTA
TAGAGTTAATGCAATCAAAGATTTATATTTAGAAGAAAGCTTCACAGATGTTAAAGAAAACTTAACAAA 
CCTAATCACAAAATTAGTTCAAAACGAAGAACATCAACTAAATGATTCTCCAGCTGCTCGTCAAATGAT 
TCGTGATAAAGTCGAGAAAAACAAAGCAGCTTTATTACTAGGTTTAACTTACCTAAATCGTTACTATGG 
AGTTAAATTTGGTGATGTTAATATTAAAGAATTAATGCTATTCAAACCAGATTTCTATGGTGAAAAAGT
TAGCGTATTAGACAGATTAATTGAAATCGGTTCTAAAGAGAACAACATTAAAGGTTCACGTACATTCGA 
CGCATTCGGTCAAGTA 

SP124 amino acid (SEQ ID NO:220) 

TPVYKVTAIADNLVSRTADNKFEEEYVHYIEKPKVHEDNVYYNFKELVEAIQNDPSKEYRLGQSMSARN
VVPNGKSYITKEFTGKLLSSEGKQFAITELEHPLFNVITNATINNVNFENVEIERSGQDNIASLANTMK
GSSVITNVKITGTLSGRNNVAGFVNNMNDGTRIENVAFFGKLHSTSGNGSHTGGIAGTNYRGIVRKAYV
DATITGNKTRASLLVPKVDYGLTLDHLIGTKALLTESVVKGKIDVSNPVEVGAIASKTWPVGTVSNSVS
YAKIIRGEELFGSNDVDDSDYASAHIKDLYAVEGYSSGNRSFRKSKTFTKLTKEQADAKVTTFNITADK 
LESDLSPLAKLNEEKAYSSIQDYNAEYNQAYKNLEKLIPFYNKDYIVYQGNKLNKEHHLNTKEVLSVTA
MNNNEFITNLDEANKIIVHYADGTKDYFNLSSSSEGLSNVKEYTITDLGIKYTPNIVQKDNTTLVNDIK
SILESVELQSQTKYQHLNRLGDYRVNAIKDLYLEESFTDVKENLTNLITKLVQNEEHQLNDSPAARQMI 
RDKVEKNKAALLLGLTYLNRYYGVKFGDVNIKELMLFKPDFYGEKVSVLDRLIEIGSKENNIKGSRTFD

AFGQV 

SP125 nucleotide (SEQ ID NO:221) 

ATTAGACAGATTAATTGAAATCGGTTCTAAAGAGAACAACATTAAAGGTTCACGTACATTCGACGCATT 

CGGTCAAGTATTGGCTAAATATACTAAATCAGGTAATTTAGATGCATTTTTAAATTATAATAGACAATT

GTTCACAAATATAGACAATATGAACGATTGGTTTATTGATGCTACAGAAGACCATGTCTACATCGCAGA 

ACGCGCTTCTGAGGTCGAAGAAATTAAAAATTCTAAACATCGTGCATTCGATAATTTAAAACGAAGTCA

CCTTAGAAATACTATACTCCCACTACTGAATATTGATAAAGCACATCTTTATTTAATTTCAAATTATAA

TGCAATTGCCTTTGGTAGTGCAGAGCGATTAGGTAAAAAATCATTAGAAGATATTAAAGATATCGTTAA 

CAAAGCTGCAGATGGTTATAGAAACTATTATGATTTCTGGTATCGTCTAGCGTCTGATAACGTTAAACA 

ACGACTACTAAGAGATGCTGTTATTCCTATTTGGGAAGGTTATAACGCTCCTGGTGGATGGGTTGAAAA

ATATGGCCGCTATAATACCGACAAAGTATATACTCCTCTTAGAGAATTCTTTGGTCCTATGGATAAGTA 

TTATAATTATAATGGAACAGGAGCTTATGCTGCTATATATCCTAACTCTGATGATATTAGAACTGATGT

AAAATATGTTCATTTAGAAATGGTTGGTGAATACGGTATTTCAGTTTACACACATGAAACAACACACGT 

CAACGACCGTGCGATTTACTTAGGTGGCTTTGGACACCGTGAAGGTACTGATGCTGAAGCATATGCTCA 

GGGTATGCTACAAACTCCTGTTACTGGTAGTGGATTTGATGAGTTTGGTTCTTTAGGTATTAATATGGT 

ATTTAAACGCAAAAATGATGGGAATCAGTGGTATATTACAGATCCAAAAACTCTAAAAACACGAGAAGA 

TATTAATAGATATATGAAGGGTTATAATGACACTTTAACTCTTCTTGATGAAATTGAGGCTGAATCTGT 

GATTTCTCAACAAAATAAAGATTTAAATAGTGCATGGTTCAAAAAAATAGATAGAGAATACCGTGATAA 

CAATAAATTAAATCAATGGGATAAAATTCGAAATCTAAGTCAAGAAGAGAAAAATGAATTAAATATTC

ATCTGTTAATGATTTAGTTGATCAACAATTAATGACTAATCGCAATCCAGGTAATGGTATCTATAAACC

CGAAGCAATTAGCTATAACGATCAATCACCTTATGTAGGTGTTAGAATGATGACCGGTATCTACGGAGG

TAATACTAGTAAAGGTGCTCCTGGAGCTGTTTCATTCAAACATAATGCTTTTAGATTATGGGGTTACTA 

CGGATACGAAAATGGGTTCTTAGGTTATGCTTCAAATAAATATAAACAACAATCTAAAACAGATGGTGA 
GTCTGTTCTAAGTGATGAATATATTATCAAGAAAATATCTAACAATACATTTAATACTATTGAAGAATT
TAAAAAAGCTTACTTCAAAGAAGTTAAAGATAAAGCAACGAAAGGATTAACAACATTCGAAGTAAATGG 
TTCTTCCGTTTCATCATACGATGATTTACTGACATTGTTTAAAGAAGCTGTTAAAAAAGATGCCGAAAC 
TCTTAAACAAGAAGCAAACGGTAATAAAACAGTATCTATGAATAATACAGTTAAATTAAAAGAAGCTGT 
TTATAAGAAACTTCTTCAACAAACAAATAGCTTTAAAACTTCAATCTTTAAA 

SP125 amino acid (SEQ ID NO: 222) 

LDRLIEIGSKENNIKGSRTFDAFGQVLAKYTKSGNLDAFLNYNRQLFTNIDNMNDWFIDATEDHVYIAE
RASEVEEIKNSKHRAFDNLKRSHLRNTILPLLNIDKAHLYLISNYNAIAFGSAERLGKKSLEDIKDIVN
KAADGYRNYYDFWYRLASDNVKQRLLRDAVIPIWEGYNAPGGWVEKYGRYNTDKVYTPLREFFGPMDKY
YNYNGTGAYAAIYPNSDDIRTDVKYVHLEMVGEYGISVYTHETTHVNDRAIYLGGFGHREGTDAEAYAQ
GMLQTPVTGSGFDEFGSLGINMVFKRKNDGNQWYITDPKTLKTREDINRYMKGYNDTLTLLDEIEAESV
ISQQNKDLNSAWFKKIDREYRDNNKLNQWDKIRNLSQEEKNELNIQSVNDLVDQQLMTNRNPGNGIYKP
EAISYNDQSPYVGVRMMTGIYGGNTSKGAPGAVSFKHNAFRLWGYYGYENGFLGYASNKYKQQSKTDGE
SVLSDEYIIKKISNNTFNTIEEFKKAYFKEVKDKATKGLTTFEVNGSSVSSYDDLLTLFKEAVKKDAET
LKQEANGNKTVSMNNTVKLKEAVYKKLLQQTNSFKTSIFK

SP126 nucleotide (SEQ ID NO:223)

TAAGACAGATGAACGGAGCAAGGTGTTTGACTTTTCCATTCCCTACTATACTGCAAAAAATAAACTCAT 
TGTCAAAAAATCTGACTTGACTACTTATCAGTCTGTAAACGACTTGGCGCAGAAAAAGGTTGGAGCGCA 
GAAAGGTTCGATTCAAGAGACGATGGCGAAAGATTTGCTACAAAATTCTTCCCTCGTATCTCTGCCTAA
AAATGGGAATTTAATCACAGATTTAAAATCAGGACAAGTGGATGCCGTTATCTTTGAAGAACCTGTTTC
CAAGGGATTTGTGGAAAATAATCCTGATTTAGCAATCGCAGACCTCAATTTTGAAAAAGAGCAAGATGA 
TTCCTACGCGGTAGCCATGAAAAAAGATAGCAAGAAATTGAAGAGGCAGTTCGATAAAACCATTCAAAA
GTTGAAGGAGTCTGGGGAATTAGACAAACTCATTGAGGAAGCCTTA 

SP126 amino acid (SEQ ID NO: 224) 

KTDERSKVFDFSIPYYTAKNKLIVKKSDLTTYQSVNDLAQKKVGAQKGSIQETMAKDLLQNSSLVSLPK 
NGNLITDLKSGQVDAVIFEEPVSKGFVENNPDLAIADLNFEKEQDDSYAVAMKKDSKKLKRQFDKTIQK 

LKESGELDKLIEEAL

SP127 nucleotide (SEQ ID NO:225) 

CTGTGAGAATCAAGCTACACCCAAAGAGACTAGCGCTCAAAAGACAATCGTCCTTGCTACAGCTGGCGA 
CGTGCCACCATTTGACTACGAAGACAAGGGCAATCTGACAGGCTTTGATATCGAAGTTTTAAAGGCAGT 
AGATGAAAAACTCAGCGACTACGAGATTCAATTCCAAAGAACCGCCTGGGAGAGCATCTTCCCAGGACT 
TGATTCTGGTCACTATCAGGCTGCGGCCAATAACTTGAGTTACACAAAAGAGCGTGCTGAAAAATACCT 
TTACTCGCTTCCAATTTCCAACAATCCCCTCGTCCTTGTCAGCAACAAGAAAAATCCTTTGACTTCTCT 
TGACCAGATCGCTGGTAAAACAACACAAGAGGATACCGGAACTTCTAACGCTCAATTCATCAATAACTG 
GAATCAGAAACACACTGATAATCCCGCTACAATTAATTTTTCTGGTGAGGATATTGGTAAACGAATCCT
AGACCTTGCTAACGGAGAGTTTGATTTCCTAGTTTTTGACAAGGTATCCGTTCAAAAGATTATCAAGGA 
CCGTGGTTTAGACCTCTCAGTCGTTGATTTACCTTCTGCAGATAGCCCCAGCAATTATATCATTTTCTC 
AAGCGACCAAAAAGAGTTTAAAGAGCAATTTGATAAAGCGCTCAAAGAACTCTATCAAGACGGAACCCT 
TGAAAAACTCAGCAATACCTATCTAGGTGGTTCTTACCTCCCAGATCAATCTCAGTTACAA 

SP127 amino acid (SEQ ID NO: 226) 

CENQATPKETSAQKTIVLATAGDVPPFDYEDKGNLTGFDIEVLKAVDEKLSDYEIQFQRTAWESIFPGL 
DSGHYQAAANNLSYTKERAEKYLYSLPISNNPLVLVSNKKNPLTSLDQIAGKTTQEDTGTSNAQFINNW
NQKHTDNPATINFSGEDIGKRILDLANGEFDFLVFDKVSVQKIIKDRGLDLSVVDLPSADSPSNYIIFS
SDQKEFKEQFDKALKELYQDGTLEKLSNTYLGGSYLPDQSQLQ 

S. pneumoniae Antigenic Epitopes

SP001 

Lys-1 to Ile-10; Leu-13 to Lys-32; Arg-41 to Ile-51; Ser-85 to Glu-97; 
Ala-159 to His-168; Val-309 to Thr-318; Val-341 to Asn-352; Asn-415 to 
Met-430; Phe-454 to Asn-464; Ser-573 to Gly-591; Asn-597 to Thr-641;
and Asn-644 to Ala-664. 

SP004 

Thr-9 to Thr-24; Ile-29 to Ala-48; Thr-49 to Val-56; Val-286 to Val-312;
Pro-316 to Glu-344; Val-345 to Ile-367; Gln-368 to Val-399; Ser-400 to
Glu-431; Asn-436 to Ala-457; Ile-467 to Ala-498; and Thr-499 to Glu-540.

SP006 

Glu-1 to Lys-13; Pro-24 to Gly-36; Val-104 to Thr-112; Ala-118 to Asn- 
130; Trp-137 to Ala-146; Ser-151 to Ile-159; Ile-181 to Leu-188; and 
Pro-194 to Tyr-202. 

SP007 

Gly-1 to Asn-7; Tyr-24 to Gln-34; His-47 to Phe-55; Ser-60 to Ala-67;
Ala-122 to Leu-129; Leu-221 to Lys-230; Val-236 to Phe-256; and Asp-271 
to Gly-283; and Leu-291 to Asp-297. 

SP008 

Leu-4 to Lys-17; Gln-24 to Leu-32; Asp-60 to Ser-66; Ser-70 to Asp-76; 
Ala-276 to Lys-283; Asn-304 to Lys-311; and Thr-429 to Pro-437. 



SP009 

Thr-4 to Glu-11; Leu-50 to Asp-60; Ile-102 to Trp-123; and Ser-138 to
Ile-157.



SP010 

Phe-34 to Gly-41; Asp-44 to Lys-50; Leu-172 to Val-186; Leu-191 to Val- 
198; Ser-202 to Ile-209; and Val-213 to Leu-221.

SP011

Asn-2 to Thr-10; Asp-87 to Ala-102; Tyr-125 to Glu-132; Thr-181 to Tyr- 
189; Arg-217 to Thr-232; Asn-257 to Lys-264; Pro-271 to Ser-278; Tyr-
317 to Ala-325; Glu-327 to Pro-337; and Thr-374 to Val-381.

SP012 

Gly-1 to Lys-19; Phe-34 to Tyr-41; Leu-109 to Lys-126; and Leu-231 to
Glu-247.

SP013 

Ala-1 to Lys-12; Ile-42 to Pro-53; Leu-138 to Lys-146; Ile-205 to Lys- 
217; Ser-235 to Ile-251; and Ser-261 to Tyr-272.

SP014 

Gly-1 to Val-16; Leu-35 to Leu-44; Asp-73 to Asp-81; Ile-83 to Asp-92; 
Glu-145 to Ile-153; Phe-188 to Asn-196; Ser-208 to Phe-215; Ile-224 to 
Leu-231; and Asn-235 to Ala-243. 

SP015 

Ser-1 to Pro-15; Asn-78 to Glu-88; Ala-100 to Val-108; Ala-122 to Thr-
129; Thr-131 to Ser-137; Leu-201 to Ser-220; and Gly-242 to Val-251.

SP016

Gly-1 to Glu-20; Thr-30 to Val-38; Gln-94 to Asn-105; Lys-173 to Pro- 
182; Gly-189 to Arg-197; Ser-207 to Val-224; Pro-288 to Leu-298; Ala- 
327 to Ala-342; and Ser-391 to Ala-402.

SP017 

Ser-1 to Thr-12; Ala-36 to Tyr-45; Gln-48 to Ile-54; Lys-59 to Lys-76; 
Tyr-113 to Leu-138; and Phe-212 to Asp-219.

SP019 

Val-97 to Glu-117; Asp-163 to Leu-169; Thr-182 to Thr-191; and Lys-241 
to Ser-250. 

SP020 

Asn-18 to Lys-25; Thr-47 to Glu-60; Trp-75 to Val-84; Gly-102 to Val- 
110; Pro-122 to Ala-131; and Glu-250 to Pro-258. 

SP021 

Ser-1 to Asp-8; Val-44 to Asp-54; Ala-117 to Val-125; Thr-165 to Thr-
173; and Glu-180 to Pro-189. 

SP022

Phe-5 to Lys-13; Thr-20 to Ser-36; Glu-59 to Lys-81; Tyr-85 to Gly-93;
Trp-94 to Trp-101; and Thr-195 to Trp-208. 

SP023 

Gln-45 to Glu-59; Asp-69 to Pro-85; Lys-111 to Asn-121; Pro-218 to Ala- 
228; and Glu-250 to Asn-281. 

SP025 

Gln-14 to Thr-20; Gly-27 to Phe-33; Gly-63 to Glu-71; and Ile-93 to 
Phe-102. 

SP028 

Asp-171 to Pro-179; Tyr-340 to Glu-350; Pro-455 to Tyr-463; and Asp-474
to Pro-480. 

SP030 

Leu-22 to Leu-37; Trp-31 to Ala-90; Phe-101 to Ala-106; Thr-124 to Tyr- 
130; and Asn-138 to Glu-144. 

SP031

Asp-8 to Val-16; Gly-27 to Thr-35; Gly-178 to Asp-195; Thr-200 to 
Asp-209; Trp-218 to Leu-224; and Lys-226 to Asp-241.

SP032 

Ser-9 to Asp-28; Phe-31 to Val-40; Gly-42 to Arg-50; Ile-52 to Leu-60; 
Asp-174 to Phe-186; Leu-324 to Met-333; and Thr-340 to Asn-347. 

SP033 

Gln-2 to Ile-13; Phe-46 to Ile-53; and Asp-104 to Thr-121. 

SP034

Glu-36 to Gly-43; Ala-188 to Asp-196; Trp-313 to Gly-320; and Leu-323 
to Leu-329. 






WO 98/18930 PCT/US97/19422 

Table 2

S. pneumoniae Antigenic Epitopes 



SP035

Arg-19 to Asp-36; Asp-47 to Val-57; Asn-134 to Thr-143; Asp-187 to Arg- 
196; and Glu-222 to Ser-230. 

SP036 

Arg-10 to Arg-17; Lys-29 to Ser-39; Ser-140 to Ala-153; Arg-158 to Tyr- 
169; Asp-175 to Ala-183; Gly-216 to Asn-236; Ala-261 to Leu-270; Arg- 
282 to Phe-291; Thr-297 to Ala-305; Pro-342 to Gln-362; Phe-455 to
Asp-463; His-497 to Thr-511; Ala-521 to Gly-529; Ile-537 to Val-546;
Ile-556 to Ala-568; Pro-581 to Ser-595; Glu-670 to Ala-685; Ser-696 to
Ala-705; and Leu-782 to Ser-791.

SP038 

Glu-61 to Pro-69; Phe-107 to Ala-115; Leu-130 to Tyr-141; Ala-229 to 
Glu-237; Ser-282 to Asn-287; Ala-330 to Glu-338; and Tyr-387 to Glu-
393. 

SP039 

Ser-28 to Ase-35; Pro-83 to Pro-96; Leu-125 to Arg-135; Phe-149 to Leu- 
157; Gln-246 to Val-254; Ala-357 to Thr-362; Gly-402 to Lys-411; and
Leu-440 to Pro-448. 

SP040 

Thr-21 to Ile-30; His-54 to Gln-68; Arg-103 to Leu-117; and Thr-127 to 
Leu-136.

SP041 

Gly-36 to Asp-49; Leu-121 to Val-128; and Ala-186 to Ile-196. 

SP042

Gly-11 to Arg-19; Ile-23 to Lys-31; His-145 to Asn-151; Gln-159 to Asp-
166; Ile-175 to Asp-131; Gly-213 to Tyr-225; Ile-283 to Val-291; Pro-
329 to Glu-364; Arg-372 to Ser-386; Thr-421 to Phe-430; Leu-445 to Val-
453; Ile-486 to Ala-497; Asp-524 to Ala-535; His-662 to Gly-674; and 
His-679 to Gln-702. 



SP043

Lys-2 to Asp-12; Val-58 to Asn-68; Ser-87 to Asp-95; and Asp-102 to
Lys-117.

SP044 

Gln-3 to Lys-11; Asp-37 to Tyr-52; Glu-171 to Leu-191; His-234 to Asn- 
247; and Asn-283 to Ala-291. 

SP045 

Tyr-52 to Ile-63; Asp-212 to Gln-227; Ser-315 to Thr-332; Leu-345 to
Phe-354; Asp-362 to Val-370; Thr-518 to Asn-539; Ala-545 to Lys-559;
and Val-601 to Pro-610. 

SP046 

Gln-9 to Ala-18; Glu-179 to Lys-186; Lys-264 to Glu-271; Gly-304 to 
Glu-17; Ser-503 to Asn-511; Asn-546 to Thr-553; and Asn-584 to Asp-591. 

SP048 

Tyr-4 to Asp-25; Lys-33 to Val-70; Asp-151 to Thr-170; Asp-222 to
Val-257; Thr-290 to Phe-301; and Gly-357 to Val-367. 

SP049

Ala-23 to Arg-37; Tyr-85 to Gln-95; Glu-106 to Ile-118; Arg-131 to
Ile-144; Gly-150 to Ser-162; and Ala-209 to Asp-218.

SP050

Asp-95 to Glu-113; Gly-220 to Gly-228; Asn-284 to Glu-295; Thr-298 
to Val-315. 

SP051 

Lys-16 to Glu-50; Lys-57 to Asn-104; Ser-158 to Trp-173; Asp-265
to Pro-279; Val-368 to Tyr-386; Glu-420 to Ile-454; Pro-476 to
Ile-516; Phe-561 to Gly-581; Thr-606 to Gly-664; and Glu-676 to
Val-696.

SP052 

Asn-41 to Tyr-60; Phe-80 to Glu-103; Ala-117 to Val-139; Ile-142 to
Leu-155; Val-190 to Lys-212; Glu-276 to Phe-283; Arg-290 to Ser-299; 
Leu-328 to Val-351; Gly-358 to Thr-388; Glu-472 to Ala-483; Val-533 
to Asn-561; Asp-595 to Val-606; Glu-609 to Val-620; Glu-672 to Ser-
691.

SP053 

Ala-62 to Val-101; Thr-147 to Leu-174; Lys-204 to Val-216; Gln-228 
to Val-262; Ser-277 to Gly-297; Thr-341 to Glyn-368; Thr-385 to Ala- 
409; Thr-414 to Ser-453; Asn-461 to Leu-490; Glu-576 to Thr-625; 
Gly-630 to Arg-639; and Asp-720 to Leu-740.

SP054 

Glu-7 to Val-28; and Tyr-33 to Glu-44. 

SP055

Pro-3 to Val-18; Thr-21 to Lys-53; Val-34 to Lys-99; Ile-162 to Val- 
172; and Val-204 to Ser-241. 

SP056 

Val-34 to Tyr-41; Leu-47 to Glu-55; and Pro-57 to Gln-66. 

SP057

Asp-1 to Val-25; Pro-29 to Ile-80; Asn-96 to Val-145; and Pro-150 to 
Glu-172. 

SP058 

Ala-64 to Thr-70; Leu-32 to His-138; and Val-228 to Asn-236. 

SP059

Val-10 to Thr-24; Ser-76 to Pro-102; Ser-109 to Ile-119; Ser-124 to 
Val-130; Thr-186 to Ile-194; and Asn-234 to Ser-243.

SP060 

Leu-70 to Arg-76; and Val-79 to Ile-88. 

SP062

Glu-14 to Lys-28; Ser-32 to Lys-45; and Glu-66 to Thr-74. 

SP063

Ile-10 to Val-25; Val-30 to Thr-40; Asp-44 to Pro-54; Asn-57 to Val-
63; Pro-71 to Val-100; and Thr-105 to Thr-116.



SP064 

Pro-12 to Leu-32; Val-40 to Leu-63; Asp-95 to Ala-125; Ser-154 to
Glu-184; Ser-314 to Glu-346; Asn-382 to Val-393; Leu-463 to Gln-498;
Asn-534 to Lys-548; and Lys-557 to Gly-605.

SP065 

Asn-2 to Ile-12; Ala-39 to Thr-61; and His-135 to Ala-155.



SP067 

Gly-1 to Thr-13; Asp-203 to Asn-213; and Gly-240 to Asp-253.

SP068

Ser-2 to Ser-12; Val-17 to Gln-25; and Lys-54 to Cys-67.



SP069 

Ser-32 to Thr-41; Pro-65 to Glu-80; Thr-110 to Val-122; and Val-147 
to Thr-180.



SP070 

Lys-6 to Tyr-16; Gln-19 to Ile-27; Arg-50 to Ala-58; Leu-112 to Val- 
128; Ile-151 to Asn-167; Leu-305 to Phe-321.



SP071 

Gln-92 to Asn-158; Gln-171 to Gln-188; Val-204 to Val-240; Thr-247 to
Ala-273; Glu-279 to Thr-338; Pro-345 to Glu-353; Asn-483 to Lys-539;
Val-552 to Ala-568; Glu-575 to Ser-591; Ser-621 to Gly-640; Gln-742
to Gly-753. 



SP072 

Val-53 to Tyr-81; Tyr-35 to Val-121; Leu-127 to Gly-140; Gly-144 to
Ala-155; Gln-168 to Val-135; Asp-210 to Try-241; Glu-246 to Thr-269;
Lys-275 to Tyr-295; Gly-303 to Pro-320; Arg-327 to Ile-335; Thr-338
to Thr-364; Tyr-473 to Phe-495; and Tyr-499 to Arg-521.

SP073 

Glu-37 to Val-45; Glu-55 to Val-63; Thr-104 to Thr-119; Ile-127 to
Tyr-135; Asn-220 to Ile-222; Thr-237 to Ala-250; Ser-253 to Ala-263;
Glu-234 to Ile-297; and Met-438 to Asn-455.



SP074 

Gly-2 to Ala-12; Gly-96 to Ile-110; and Thr-220 to Phe-239.

SP075

Phe-33 to Tyr-42; Gln-93 to Gly-102; and Val-196 to Asp-211.



SP076 

Ser-64 to Leu-76; and Phe-31 to Ala-101.



SP077 

Asp-1 to Glu-12; Tyr-26 to Val-36; and Val-51 to Try-62.

SP078

Ala-193 to Ile-208; Tyr-266 to Asn-275; Glu-356 to Leu-369; Ala-411 
to Gly-422; Ser-437 to Pro-464; Thr-492 to Glu-534; and Glu-571 to
Gln-508. 

SP079 

Gly-11 to Leu-20; Lys-39 to Leu-48; Leu-72 to Val-85; Asn-147 to Ser-
158; Ile-178 to Asp-187; Tyr-139 to Gln-201; and Leu-203 to Ala-216.

SP080

Ser-2 to Glu-12; Gln-42 to Ala-51; Ala-116 to Ser-127; Phe-131 to 
Asp-143; and Ile-159 to Ile-171. 

SP081

Gln-2 to Leu-9; Gln-49 to Cys-57; Ile-108 to Val-131; Gly-134 to Leu- 
145; and Trp-154 to Cys-162. 

SP082 

Ile-101 to Ser-187; Gly-191 to Asn-221; Arg-225 to Arg-236; Tyr-239 
to Leu-255; and Gly-259 to Arg-263.

SP083 

Ser-28 to Asp-70.

SP084

Leu-42 to Gln-66; Thr-59 to Lys-81; Glu-83 to Arg-92; and Gly-98 to
Asn-110.

SP085

Gln-2 to Val-22; and Ser-45 to Glu-51.

SP086

Leu-18 to Gln-65; and Lys-72 to Val-83.



SP087 

Ser-45 to Leu-53; and Thr-55 to Gln-63.

SP088

Pro-8 to Ile-16; Leu-25 to Trp-33; Tyr-35 to Gln-43; Leu-51 to Val-59;
Val-59 to Arg-67; Thr-55 to Tyr-63; Asn-85 to Gly-93; Thr-107 to
Leu-115; Leu-115 to Trp-123; Ala-121 to Thr-129; Tyr-153 to Ala-161;
His-176 to Gly-184; Tyr-194 to Ala-202; Ala-217 to Gly-225; and
Asn-85 to Gly-93.

SP089

Trp-43 to Ala-51; Gln-68 to Phe-76; Val-93 to Gln-101; Phe-106 to 
Phe-114; Lys-117 to Lys-125; Trp-148 to Phe-156; Glu-168 to Gln-176; 
Ile-193 to Tyr-201; Lys-203 to Lys-211; Glu-212 to Gln-220; Ile-237 to 
Tyr-245; Lys-247 to Lys-255; Glu-256 to Gln-264; Met-275 to Gly-283;
Lys-286 to Gly-294; Trp-292 to Glu-300; Asp-289 to Thr-297; Tyr-315 to 
Ser-323; Asp-334 to Lys-342; Pro-371 to Arg-379; Arg-485 to Asn-493;
Lys-527 to Arg-535; Phe-537 to Met-545; and Tyr-549 to Glu-557. 

SP090 

Phe-2 to Gln-10; Gln-13 to Lys-21; Tyr-19 to Glu-27; Tyr-39 to Met-47;
Pro-65 to Leu-73; Tyr-121 to His-129; Lys-147 to Ile-155; Gly-151 to
Lys-169; Gly-218 to Trp-226; Asp-230 to Thr-238; Tyr-249 to Ala-257; 
and Ala-272 to Gly-280. 

SP091 

Ser-19 to Ser-27; Asn-25 to Thr-33; Val-51 to Gln-59; Asn-75 to Asn-83; 
Ile-103 to Trp-111; Tyr-113 to Ala-121; Leu-175 to Asn-183; Glu-185 to 
Trp-193; Ala-203 to Tyr-211; Val-250 to Phe-258; Asn-250 to Thr-268; 
Ser-278 to Asp-286; Tyr-305 to Leu-313; Asn-316 to Gly-324; Asn-374 to
Asp-332; Asn-441 to Gly-449; and Ser-454 to Gln-462. 

SP092

Arg-95 to Glu-103; Ala-215 to Val-224; Leu-338 to Glu-345; Pro-350 to
Ala-358; Pro-359 to Ala-367; Pro-368 to Ala-376; Pro-377 to Ala-385;
Pro-386 to Ala-394; Pro-395 to Ala-403; Pro-350 to Ala-358; Gln-414 to
Lys-422; Pro-421 to Asn-429; Trp-465 to Tyr-473; Phe-487 to Tyr-495;
Asn-517 to Gly-525; Trp-536 to Tyr-594; Phe-508 to Tyr-615; and Asp-630
to Gly-638.

SP093 

Gln-30 to Ile-38; Gln-52 to Val-50; Ala-108 to His-116; Tyr-133 to
Glu-141; Tyr-192 to Ala-200; and Phe-207 to Ser-215.

SP094 

Ala-87 to Val-95; Leu-110 to Cys-118; Gln-133 to Leu-141; Ser-135 to
Leu-193; Ile-195 to Gly-203; Asp-206 to Gln-214; Ser-211 to Gly-219;
Ile-241 to Thr-249. 

SP095 

Arg-1 to Gln-9; Phe-7 to Asn-15; Thr-21 to Asn-30; Leu-46 to Phe-54; 
and Ser-72 to Met-80. 

SP096

Gly-29 to Ile-37; Glu-52 to Ser-60; and Leu-64 to Gly-72.

SP097

Ala-11 to Thr-19; Glu-53 to Glu-61; Ser-91 to Lys-99; Thr-123 to
Gln-131; and Gly-209 to Lys-217. 

SP098 

Thr-3 to Ser-11; Gly-38 to Phe-46; Tyr-175 to Asn-183; Met-137 to
Cys-195; Gln-197 to Leu-205; Tyr-307 to Gln-315; Gly-318 to Tyr-326;
Asn-348 to Val-356; Lys-377 to Pro-385; and Leu-415 to Val-423.

SP099 

Arg-19 to Gly-27; Asp-76 to Ser-84; Val-90 to Lys-98; Phe-165 to 
Val-173; Leu-237 to Pro-245. 

SP100 

His-111 to Gln-119; Ser-141 to His-149; Asp-154 to Ser-162; Gln-158
to Gln-166; Asp-154 to Gln-166; Lys-180 to Gln-188; and Ser-206 to
Gln-214. 

SP101 

Glu-23 to Glu-31; Glu-40 to Val-48; Gln-50 to Ser-53; Thr-61 to
Ile-69; Leu-82 to Ile-90; Ala-108 to Leu-116; Gln-121 to Pro-129; 
and Leu-130 to Thr-138. 

SP102 

Asp-32 to His-40; Arg-48 to Lys-56; and Asp-102 to Thr-110. 

SP103

Arg-5 to Gln-13; Gln-22 to Leu-30; Arg-151 to Gln-159; Arg-167 to 
Gln-175; Pro-189 to Glu-197; Gly-207 to Leu-215; Ser-219 to Gln-227; 
Ser-233 to Ser-241; Pro-255 to Asp-264; Lys-272 to Gly-280; Ser-318 
to Val-326; Thr-341 to Asp-351; Asn-356 to Thr-364; Val-370 to
Tyr-378; Ile-379 to Gln-387; and Met-435 to Tyr-443.



SP105 

Asn-28 to Pro-35; Thr-77 to Phe-35; Arg-88 to Val-96; Gly-107 to
Phe-115; Asp-169 to Asp-177; His-248 to Ser-256; and Ser-274 to
Ala-282.

SP106

Val-10 to Thr-18; Ile-52 to Tyr-70; Ile-71 to Pro-79; Lys-86 to
Gln-94; Lys-100 to Thr-108; Phe-132 to Leu-140; and Asp-145 to
Arg-153. 

SP107 

Asp-33 to Val-41; and Arg-63 to Gln-71. 

SP108

Lys-9 to Gln-17; Leu-44 to Ser-52; Ser-63 to Phe-71; Tyr-109 to
Ser-117; Ile-133 to Ile-191; Pro-194 to Leu-202; Gly-257 to Gln-265;
Ala-323 to Thr-331; and Leu-381 to Tyr-389.

SP109 

Asn-2 to Gln-10; Ala-65 to Lys-73; Leu-75 to Glu-84; Thr-111 to
Asp-119; Gln-116 to Tyr-124; Tyr-130 to Val-138; Asp-173 to Gly-181;
Asp-196 to Ser-204; Asn-231 to Ser-239; Phe-252 to Ser-260; Phe-270 to 
Tyr-278; Val-291 to His-299; Asp-306 to Leu-314; and Pro-327 to 
Gly-335. 

SP110 

Ser-8 to Glu-16; Ile-37 to Val-45; Ala-107 to Val-115; and Gly-122 
to Thr-130. 

SP111 

Asp-19 to Glu-28; Leu-43 to Ala-51; Asn-102 to Phe-110; Gln-133 to
Ser-141; Phe-162 to Asp-170; Tyr-194 to Met-202; and Asp-273 to
Ser-281. 

SP112

Asp-3 to Gln-11; Gly-21 to Ile-29; Ala-46 to Arg-54; Arg-98 to
Arg-106; Thr-114 to Val-122; Gln-133 to Asn-141; and Leu-223 to
Thr-231.

SP113 

Asn-19 to Gly-27; Arg-54 to Ser-62; Val-69 to Gln-77; Ser-117 to 
Asn-125; Gly-164 to Leu-172; Tyr-193 to Ser-201; Cys-303 to Phe-311;
His-315 to Ile-323; Arg-341 to Cys-349; Ile-347 to Ser-355; Arg-403 
to Phe-411; Gln-484 to Pro-492; Ser-499 to Leu-507; Ile-541 to
Thr-549; Asn-622 to Ile-630; and Glu-645 to Gly-653.

SP114

Gly-17 to Leu-25; His-40 to Gln-48; Arg-49 to Arg-57; Ile-55 to
Pro-73; Asn-101 to Asp-111; Gly-128 to Cys-136; Phe-183 to Thr-191; and
Pro-268 to Ile-276. 

SP115 

Met-8 to Ser-16; Tyr-24 to Leu-32; Cys-68 to Leu-76; Ser-100 to 
Pro-108; Thr-193 to Thr-201; Gly-238 to Pro-250; Thr-280 to Phe-288;
Pro-303 to Asn-312; Trp-319 to Leu-328; Leu-335 to Leu-344; Lys-395
to Ala-403; Asn-416 to Gln-424; Tyr-430 to Ser-438; Val-448 to
Leu-456; Leu-460 to Thr-468; Pro-502 to Thr-510; Lys-515 to
Ile-524; Gln-523 to His-532; Tyr-535 to Thr-543; Ser-559 to
Pro-567; Thr-572 to Asn-580; Val-594 to Arg-602; Arg-603 to
Asn-611; Thr-620 to Trp-628; and
Tyr-644 to Arg-653. 

SP117 

Ala-6 to Gly-14; Ile-19 to Thr-27; Thr-99 to Leu-107; Ser-117 to
Asp-125; His-131 to Val-139; Ile-193 to Gly-201; and Val-241 to
Gln-249.

SP118 

Ser-8 to Trp-23; His-46 to Ala-54; Asn-93 to Gly-101; Val-100 to
Ser-108; Arg-155 to Asp-163; and His-192 to Leu-200. 

SP119 

Tyr-46 to Lys-54; Ser-93 to Ser-101; Trp-108 to Asn-116; Val-121 to 
Glu-129; and Tyr-131 to Gln-139. 

SP120 

Ala-57 to Lys-65; Leu-68 to Glu-76; Thr-103 to Tyr-116; Tyr-122 to
Val-130; His-163 to Gly-173; Asp-138 to Ser-196; Ser-222 to Ser-231; 
Phe-244 to Ser-252; Pro-262 to Tyr-270; Val-283 to His-291; and 
Asp-298 to Leu-306. 

SP121 

Ser-3 to Ala-11; Asp-13 to Leu-21; Ser-36 to Val-44; and Gln-136 to
Met-144.

SP122

Asn-28 to Lys-36; Glu-39 to Thr-50; Val-54 to Lys-62; Asn-106 to
Leu-114; Phe-159 to Gly-157; Asn-172 to Arg-180; Glu-199 to Asn-207; 
Lys-230 to His-241; Asn-252 to Gly-263; Met-278 to Ala-287; Thr-346
to Asp-354; Lys-362 to Thr-370; Asp-392 to Asn-405; Asp-411 to 
Ala-424; Gly-434 to Gly-443; Tyr-484 to Glu-492; Ile-511 to 
Leu-519; Asn-524 to Asp-538; Glu-552 to Ile-567; Val-605 to Lys-613; 
Phe-697 to Ala-705; Phe-722 to Leu-730; Leu-753 to Leu-761; Asp-787
to Gln-795; Leu-858 to Asn-866; Ala-892 to Thr-901; Gly-903 to 
Ile-913; Ile-921 to Asn-931; Asn-938 to Pro-951; Gly-960 to 
Lys-970; Leu-977 to Asp-985; and Leu-988 to Pro-996. 

SP123 

Val-4 to Asn-12; Glu-47 to Leu-55; Lys-89 to Glu-100; Ser-155 to
Thr-173; Lys-234 to Val-242; Ser-258 to Ser-266; Glu-284 to Asn-292; 
Tyr-327 to Leu-335; Tyr-457 to Thr-465; Tyr-493 to Glu-501; Thr-506 
to Tyr-514; Lys-517 to Thr-525; Asn-532 to Gly-540; and Arg-556 to
Glu-564. 

SP124

Arg-16 to Glu-24; Gln-52 to Arg-60; Asn-69 to Tyr-77; Glu-121 to
Asn-129; Ala-134 to Val-142; Thr-151 to Ala-159; Asn-164 to Glu-172;
His-181 to His-189; Thr-210 to Ala-218; Ser-244 to Val-252; Phe-287
to Tyr-297; Ser-312 to Thr-323; His-433 to Tyr-441; Ser-445 to
Asn-453; Asn-469 to Thr-477; Asn-501 to Asn-509; Gln-536 to Ala-547; and
Gln-608 to Asp-621. 

SP125 

Ser-9 to Asp-21; Ala-28 to Leu-36; Asn-49 to Phe-57; Val-137 to
Arg-145; Asn-155 to Leu-163; Glu-183 to Asp-191; Gly-202 to Tyr-210;
Pro-221 to Asp-229; Phe-263 to Ala-271; Phe-300 to Gln-309; Asp-313
to Glu-321; Asn-324 to Asp-332; Ile-346 to Asn-354; Asp-352 to
Lys-370; Met-402 to Gly-410; Gly-437 to Gly-445; Ser-471 to Glu-483;
Gly-529 to Asp-537; Gln-555 to Val-563; and Leu-579 to Lys-537.

SP126 

Leu-22 to Thr-30; Val-65 to Leu-73; and Thr-75 to Asp-83.

SP127

Glu-2 to Ala-12; Asp-28 to Thr-36; Val-105 to Thr-113; Lys-121 to 
Thr-129; Trp-138 to Pro-146; Ser-152 to Ile-160; Lys-180 to Asp-188; 
Leu-194 to Asn-202; and Gly-228 to Thr-236. 
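
The epitope notation used in the table above ("Xxx-N to Yyy-M", entries separated by semicolons) can be read mechanically. The following is only an illustrative sketch and is not part of the original disclosure; the function names `parse_epitopes` and `epitope_lengths` are hypothetical, and the sample entry is the SP033 line from the table.

```python
import re

# Hypothetical helper, not part of the original disclosure: matches one
# epitope range such as "Gln-2 to Ile-13" (three-letter residue code,
# hyphen, 1-based position in the listed amino acid sequence).
RANGE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(entry):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE.findall(entry)]


def epitope_lengths(entry):
    """Return the inclusive length, in residues, of each epitope range."""
    return [end - start + 1 for _, start, _, end in parse_epitopes(entry)]


# SP033 entry as it appears in the table.
sp033 = "Gln-2 to Ile-13; Phe-46 to Ile-53; and Asp-104 to Thr-121."
print(parse_epitopes(sp033))
print(epitope_lengths(sp033))
```

Entries whose range wraps across a line break (for example "Val-" followed by "312") would need to be joined before parsing; this sketch assumes that has already been done.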

Primer Name  SEQ ID  Sequence  RE
SP001A  NO:227  [illegible]  Bam HI
SP001B  NO:228  [illegible]  Sal I
SP004A  NO:229  [illegible]  Bam HI
SP004B  NO:230  CAGTGTCGA[remainder illegible]
SP006A  NO:231  GACTGGATCCTGAGAA[remainder illegible]  Bam HI
SP006B  NO:232  AGTCAAGCTTTTGTAACTGAGATTGATCTGG  Hind III
SP007A  NO:233  GACTGGATCCTGGTAACCGCTCTTCTCGTAACGCAGC  Bam HI
SP007B  NO:234  AGTCAAGCTTTTTCAGGAACTTTTACGCTTCC  Hind III
SP008A  NO:235  AGTCAGATCTTGTGGAAATTTGACAGGTAACAGCAAAAAAGCTGC  Bgl II
SP008B  NO:236  ACTGAAGCTTTTTTGTTTTTCAAGAATTCATCG  Hind III
SP009A  NO:237  GACTGGATCCTGGTCAAGGAACTGCTTCTAAAGAC  Bam HI
SP009B  NO:238  agtcaagctttcacaaattcgttggtgaagcc  Hind III
SP010A  NO:239  GACTGGATCCTAGOTCAGGTGGAAACGCTGGTTCATCC  Bam HI
SP010B  NO:240  AGTCAAGCTTATCAACTTTTCCACCTTCAACAACC  Hind III
SP011A  NO:241  GTCAAGATCTCTCCAACTATGGTAAATCTGCGGATGG  Bgl II
SP011B  NO:242  AGTCCTGCAGATCCACATCCGCTTTCATCGGGTTAAAGAAGG  Pst I
SP012A  NO:243  GACTGGATCCTGGGAAAAATTCTAGCGAAACTAGTGG  Bam HI
SP012B  NO:244  gtcactgcagctgtccttcttttacttctttggttgc  Pst I
SP013A  NO:245  gactggatcctgctagcggaaaaaaagatacaacttctgg  Bam HI
SP013B  NO:246  ctgaaagcttttttgccaatccttcagcaatcttgtc  Hind III
SP014A  NO:247  gactagatcttggctcaaaaaatacagcttcaagtcc  Bgl II
SP014B  NO:248  agtcctgcaggtttttgtttgcttggtattggtcg  Pst I
SP015A  NO:249  gactggatcctagtacaaactcaagcactagtcagacagag  Bam HI
SP015B  NO:250  cagtctgcagtttcaaagctttttgtatgtcttc  Pst I
SP016A  NO:251  GACTGGATCCTGGCAATTCTGGCGGAAGTAAAGATGC  Bam HI
SP016B  NO:252  AGTCAAGCTTGTTTCATAGCTTTTTTGATTGTTTCG  Hind III
SP017A  NO:253  GACTGGATCCTTCACAAGAAAAAACAAAAAATGAAGATGG  Bam HI
SP017B  NO:254  AGTCAAGCTTATCGACGTAGTCTCCGCCTTC  Hind III
SP019A  NO:255  GACTGGATCCGAAAGGTCTGTGGTCAAATAATCTTACC  Bam HI
SP019B  NO:256  AGTCAAGCTTAGAGTTAACATGGTGCTTGCCAATAGG  Hind III
SP020A  NO:257  GACTGGATCCAAACTCAGAAAAGAAAGCAGACAATGC  Bam HI
SP020B  NO:258  AGTCAAGCTTCCAAACTGGTTGATCCAAACCATCTG  Hind III
SP021A  NO:259  [illegible]  Bam HI
SP021B  NO:260  AGTCAAGCT[remainder illegible]  Hind III
SP022A  NO:261  CTGAGGATCCGGGGA[remainder illegible]  Bam HI
SP022B  NO:262  [illegible]  Hind III
SP023A  NO:263  [illegible]  Bam HI
SP023B  NO:264  [illegible]  Hind III
SP025A  NO:265  [illegible]  Bam HI
SP025B  NO:266  [illegible]  Sal I
SP028A  NO:267  [illegible]  Bam HI
SP028B  NO:268  [illegible]  Pst I
SP030A  NO:269  [illegible]  Bam HI
SP030B  NO:270  CAGTAAGC[remainder illegible]  Hind III
SP031A  NO: 271  GACTGGATCCCCAGGCTGATACAAGTATCGCA  Bam HI
SP031B  NO: 272  CAGTAAGCTTATCTGCAGTATGGCTAGATGG  Hind III
SP032A  NO: 273  GACTGGATCCGTCTGTATCATTTGAAAACAAAGAAAC  Bam HI
SP032B  NO: 274  CAGTCTGCAGTTTTACTGTTGCTGTGCTTGTG  Pst I
SP033A  NO: 275  ACTGAGATCTTGGTCaAAAGGaAAGTCAGACAGGAAAGG  Bgl II
SP033B  NO: 276  CAGTAAGCTTATTCCTGAGCTTTTTTGATaAAGGTTGCGCA  Hind III
SP034A  NO: 277  ACTGGGATCCGaAGGATAGATATATTTTAGCATTTGAGAC  Bam HI
SP034B  NO: 278  AGTCAAGCTTCCATGGTATCaAAGGCAAGACTTGG  Hind III
SP035A  NO: 279  GTCAGGATCCGGTAGTTAAAGTTGGTATTaACGG  Bam HI
SP035B  NO: 280  agtcaagcttgcaatttttgcgaagtattccaagag  Hind III
SP036A  NO: 281  AGTCGGATCCTTCTTACGAGTTGGGACTGTATCAAGC  Bam HI


SP036B  NO: 282  AGTCAAGCTTGTTTATTTTTTCCTTACTTACAGATGAAGG  Hind III
SP038A  NO: 283  AGTCGGATCCTACTGAGATGCATCATAATCTAGGAGC  Bam HI
SP038B  NO: 284  TCAGCTCGAGTTCTTTGACATCTCCATCATAAGTCGC  Xho I
SP039A  NO: 285  GACTGGATCCGGTTTTGAGAAAGTATTTGCAGGGG  Bam HI
SP039B  NO: 286  CAGTAAGCTTGGATTTTTTCATGGATGCAATTTTTTTGG  Hind III
SP040A  NO: 287  GACTGGATCCGACAACATTTACTATCCATACAGTAGAGTCAGC  Bam HI
SP040B  NO: 288  GACTAAGCTTGGCATAAGGTTGCAATTCTGGATTAATTGG  Hind III
SP041A  NO: 289  GACTGGATCCGGCTAAGGAAAGAGTGGATG  Bam HI
SP041B  NO: 290  GACTAAGCTTTTCATTTTTAAATTGACTATGCGCCCG  Hind III
SP042A  NO: 291  GACTGGATCCTTGTTCCTATGAACTTGGTCGTCACC  Bam HI
SP042B  NO: 292  CATGAAGCTTATCCTGGATTTTTCCAAGTAAATCT  Hind III
SP043A  NO: 293  GACTGGATCCTTATAAGGGTGAATTAGAAAAAGG  Bam HI
SP043B  NO: 294  [missing]  Hind III
SP044A  NO: 295  GACTGGATCCGAATGTTCAGGCTCAAGAAAGTTCAGG  Bam HI
SP044B  NO: 296  GACTAAGCTTTTCCCCTGATGGAGCAAAGTAATACC  Hind III
SP045A  NO: 297  GACTGGATCCCTTGGGTGTAACCCATATCCAGCTCCTTCC  Bam HI
SP045B  NO: 298  GACTGTCGACTTCAGCTTGTTTATCTGGGGTTGC  Sal I
SP046A  NO: 299  GACTGGATCCTAGTGATGGTACTTGGCAAGGAAAACAG  Bam HI
SP046B  NO: 300  ACTGCTGCAGATCTTTGCCACCTAGCTTCTCATTG  Pst I
SP048A  NO: 301  GTCAGGATCCTGGGATTCAATATGTCAGAGATGATACTAG  Bam HI
SP048B  NO: 302  CTAGAAGCTTACGCACCCATTCACCATTATCATTG  Hind III
SP049A  NO: 303  GTCAGGATCCGGATAATAGAGAAGCATTAAAAACC  Bam HI
SP049B  NO: 304  AGTCAAGCTTGACAAAATCTTGAAACTCCTCTGGTC  Hind III
SP050A  NO: 305  GTCAGGATCCAGATTTTGTCGAGGAGTGTCATACC  Bam HI
SP050B  NO: 306  AGTCAAGCTTTCCCTTTTTACCCTTACGAATCCAGG  Hind III
SP051A  NO: 307  GACTGGATCCATCTGTAGTTTATGCGGATGAAACACTTATTAC  Bam HI
SP051B  NO: 308  GACTGTCGACGCTTTGGTAGAGATAGAAGTCATG  Sal I


SP052A  NO: 309  GACTGGATCCTTACTTTGGTATCGTAGATACAGCCGGC  Bam HI
SP052B  NO: 310  AGTCAAGCTTTGTTAATTGCGTACCTTCTAAGCGACC  Hind III
SP053A  NO: 311  GACTGGATCCAGCTAAGGTTGCATGGGATGCGATTCG  Bam HI
SP053B  NO: 312  GACTGTCGACCTGGGCTTTATTAGTTTGACTAGC  Sal I
SP054A  NO: 313  CAGTGGATCCCTATCACTATGTAAATAAAGAGA  Bam HI
SP054B  NO: 314  ACTGAAGCTTTTCTGTCCCTGTTTGAGGCA  Hind III
SP055A  NO: 315  CAGTGGATCCTGAGACTCCTCAATCAATAACAAA  Bam HI
SP055B  NO: 316  ACGTAAGCTTATAATCAGTAGGAGAAACTGAACT  Hind III
SP056A  NO: 317  CAGTGGATCCGGATGCTCAAGAAACTGCGG  Bam HI
SP056B  NO: 318  GACTAAGCTTTTGCCTCTCATTCTTGCTTCC  Hind III
SP057A  NO: 319  CAGTGGATCCCGACAAAGGTGAGACTGAG  Bam HI
SP057B  NO: 320  ACGTAAGCTTATTTCTTAATTCAAGTGTTTTCTCTG  Hind III
SP058A  NO: 321  GACTGGATCCAAATCAATTGGTAGCACAAGATCC  Bam HI
SP058B  NO: 322  CAGTGTCGACATTAGGAGCCACTGGTCTC  Sal I
SP059A  NO: 323  CAGTGGATCCCAAACAGTCAGCTTCAGGAAC  Bam HI
SP059B  NO: 324  GACTCTGCAGTTTAATCTTGTCCCAGGTGG  Pst I
SP060A  NO: 325  GACTGGATCCATTCGATGATGCGGATGAAAAG  Bam HI
SP060B  NO: 326  GACTAAGCTTCATTTGTCTTTGGGTATTTCGCA  Hind III
SP062A  NO: 327  CAGTGGATCCGGAGAGTCGATCAAAAGTAG  Bam HI
SP062B  NO: 328  GTCACTGCAGTTGCTCGTCTCGAGGTTC  Pst I
SP063A  NO: 329  CAGTGGATCCATGGACAACAGGAAACTGGGAC  Bam HI
SP063B  NO: 330  CAGTAAGCTTATTAGCTTCTGTACCTGTGTTTG  Hind III
SP064A  NO: 331  GACTGGATCCCGATGGGCTCAATCCAACCCCAGGTCAAGTC  Bam HI
SP064B  NO: 332  GACTCTGCAGCATAGCTTTATCCTCTGACATCATCGTATC  Pst I
SP065A  NO: 333  GACTGGATCCTTCCAATCAAAAACAGGCAGATGG  Bam HI
SP065B  NO: 334  GACTAAGCTTGAGTCCCATAGTCCAAGGCA  Hind III
SP067A  NO: 335  AGTCGGATCCTATCACAGGATCGAACGGTAAGACAACC  Bam HI
SP067B  NO: 336  ACTGGTCGACTTCTTTTAACTCCGCTACTGTGTC  Sal I


SP068A  NO: 337  CAGTGGATCCAAGTTCATCGAAGATGGTTGGGAAGTCC  Bam HI
SP068B  NO: 338  GATCGTCGACCCGCTCCCACATGCTCAACCTT  Sal I
SP069A  NO: 339  TGACGGATCCATCGCTAGCTAGTGAAATGCAAGAAAG  Bam HI
SP069B  NO: 340  TGACAAGCTTATTCGTTTTTGAACTAGTTGCTTTCGT  Hind III
SP070A  NO: 341  GACTGGATCCGCACCAGATGGGGCACAAGGTTCAGGG  Bam HI
SP070B  NO: 342  [illegible]  Hind III
SP071A  NO: 343  [illegible]  Bgl II
SP071B  NO: 344  [illegible]  Hind III
SP072A  NO: 345  [illegible]  Bgl II
SP072B  NO: 346  [illegible]  Hind III
SP073A  NO: 347  [illegible]  Sal I
SP073B  NO: 348  [illegible]  Hind III
SP074A  NO: 349  [illegible]  Bam HI
SP074B  NO: 350  [illegible]  [missing]
SP075A  NO: 351  [illegible]  [missing]
SP075B  NO: 352  [illegible]  [missing]
SP076A  NO: 353  [illegible]  Bam HI
SP076B  NO: 354  [illegible]  Hind III
SP077A  NO: 355  [illegible]  [missing]
SP077B  NO: 356  [illegible]  [missing]
SP078A  NO: 357  [illegible]  Bam HI
SP078B  NO: 358  [illegible]  [illegible]
SP079A  NO: 359  [illegible]  Bam HI
SP079B  NO: 360  [illegible]  Pst I
SP080A  NO: 361  [illegible]  Bam HI
SP080B  NO: 362  [illegible]  Hind III
SP081A  NO: 363  [illegible]  Bam HI
SP081B  NO: 364  [illegible]  Hind III
SP082A  NO: 365  [illegible]  Bam HI
SP082B  NO: 366  [illegible]  Hind III
SP083A  NO: 367  [illegible]  Bam HI
SP083B  NO: 368  [illegible]  Bgl II
SP084A  NO: 369  [illegible]  Bam HI
SP084B  NO: 370  [illegible]  Hind III
SP085A  NO: 371  [illegible]  Bam HI
SP085B  NO: 372  [illegible]  Hind III
SP086A  NO: 373  [illegible]  Bam HI
SP086B  NO: 374  [illegible]  Hind III
SP087A  NO: 375  CAGTGGATCCGAACCGACAAGTCGCCCACTATCAAGACT  Bam HI
SP087B  NO: 376  [illegible]  Hind III
SP088A  NO: 377  [illegible]  Bam HI
SP088B  NO: 378  CAGTAAGCTTCCGAACCCATTCGCCATTATAGTTGAC  Hind III
SP089A  NO: 379  [illegible]  Bam HI
SP089B  NO: 380  [illegible]  Pst I


SP090A  NO: 381  GACTGGATCCAT/ITGCAGATGATTCTGAAGGATGG  Bam HI
SP090B  NO: 382  TCAGCTGCAGCTTAACCCATTCACCATTCTAGTTTAAG  Pst I
SP091A  NO: 383  GACTGGATCCTGTCGCTGCAAATGAAACTGAAGTAGC  Bam HI
SP091B  NO: 384  GACTAAGCTTATACCAAACGCTGACATCTACGCG  Hind III
SP092A  NO: 385  AGTCAGATCTTACGTCTCACKTCTACTTTTGTAAGAGC  Bgl II
SP092B  NO: 386  GACTAAGCTTAACCCATTCACCATTGLjCATTGAC  Hind III
SP093A  NO: 387  CAGTGGATCCTGGACAl^TGAaAGGTCATGCTACATTTGTG  Bam HI
SP093B  NO: 388  GACTAAGCTTCAACCATTGAG^CCTTGCa.ACAC  Hind III
SP094A  NO: 389  GTCAGGATCCGATTGCTCCTTTGAAGGATTTGAGAGAaACC  Bam HI
SP094B  NO: 390  GACTAaGCTTCGATCAaAGATAAGATAAATATATATAAAGT  Hind III
SP095A  NO: 391  GACTGGATCCTAGGTCATATl^ACTTTTTTTCTACAACAAaATAGG  Bam HI


SP095B  NO: 392  TGACAAGCTTATCTATCAGCTCATTTAATCGTTTTTG  Hind III
SP096A  NO: 393  CTGAGGATCCCAACGTTGAGAATTATTTGCGAATG  Bam HI
SP096B  NO: 394  TGACAAGCTTGAGTCTACAAAAGTAATGTAC  Hind III
SP097A  NO: 395  GTCAGGATCCCTACTATCAATCAAGTTCTTCAGCC  Bam HI
SP097B  NO: 396  TGACAAGCTTGACTGAGGCTTGGACCAGATTGAAAAG  Hind III
SP098A  NO: 397  GACTGGATCCGACAAAAACATTAAAACGTCCTGAGG  Bam HI
SP098B  NO: 398  GACTAAGCTTAGCACGAACTGTGACGCTGGTTCC  Hind III
SP099A  NO: 399  GACTGGATCCTTCTCAGGAGACCTTTAAAAATATC  Bam HI
SP099B  NO: 400  GACTAAGCTTGTTGGCCATCTTGTACATACC  Hind III
SP100A  NO: 401  GACTGGATCCAGTAAATGCGCAATCAAATTC  Bam HI
SP100B  NO: 402  AGTCCTGCAGGTATTTAGCCCAATAATCTATAAAGCT  Pst I
SP101A  NO: 403  CAGTGGATCCTTACCGCGTTCATCAAGATGTC  Bam HI
SP101B  NO: 404  GACTAAGCTTGCCAGATGTTGAAAAGAGAGTG  Hind III
SP102A  NO: 405  GACTGGATCCGTGGATGGGCTTTAACTATCTTCGTATTCG  Bam HI
SP102B  NO: 406  AGTCAAGCTTGCTAGTCTTCACTTTCCCTTTCC  Hind III
SP103A  NO: 407  GACTGTCGACACTAAACCAGCATCGTTCGCAGGA  Sal I
SP103B  NO: 408  CTGACTGCAGCTTCTTGAAGAAATAATGATTGTGG  Pst I
SP105A  NO: 409  CAGTGGATCCTGACTACCTTGAAATCCCACTT  Bam HI
SP105B  NO: 410  CAGTAAGCTTTTTTTTAAGGTTGTAGAATGATTTCAATC  Hind III
SP106A  NO: 411  CAGTGTCGACTCGTATCTTTTTTTGGAGCAATGTT  Sal I
SP106B  NO: 412  GACTAAGCTTAAATGTTCCGATACGGGTGATTG  Hind III


SP107A  NO: 413  CAGTGGATCCGGACtCTCTCAAAGATGTGAAAG  Bam HI
SP107B  NO: 414  GACTAAGCTTCTTGAGTTTGTCAAGGATTGCTTT  Hind III
SP108A  NO: 415  CAGTGGATCCCAAGAAATCCTATCATCTCTTCCAGAAG  Bam HI
SP108B  NO: 416  GACTAAGCTTTTCAGAACTAAAAGCCGCAGCTT  Hind III
SP109A  NO: 417  GACTGGATCCACGAAATGCAGGGCAGACAG  Bam HI
SP109B  NO: 418  CAGTAAGCTTATCAACATAATCTAGTAAATAAGCGT  Hind III
SP110A  NO: 419  CAGTGGATCCTGTATAGTTTTTAGCGCTTGTTCTTC  Bam HI
SP110B  NO: 420  GTCAAAGCTTTGATAGAGTGTCATAATCTTCTTTAG  Hind III
SP111A  NO: 421  GACTGGATCCGTGTGTCGAGCATATTCTGAAG  Bam HI
SP111B  NO: 422  CAGTAAGCTTACTTTTACCATTTCTTTGTTCTGCATC  Hind III
SP112A  NO: 423  GACTGTCGACGTGTTTGGATAGCATTCAGAATCAGACG  Sal I
SP112B  NO: 424  CAGTAAGCTTCGGAAGTAAAGACAATTTTTCC  Hind III
SP113A  NO: 425  CAGTGGATCCGTGCCTAGATAGTATTATTACTCAAAC  Bam HI
SP113B  NO: 426  GACTAAGCTTTTTGCTTATTTCTCTCAATTTTTC  Hind III
SP114A  NO: 427  CAGTGGATCCCATTCAGAAGCAGACCTATCAAAATC  Bam HI
SP114B  NO: 428  ACTGAAGCTTATGTAATTTTTTAGATTTTTCAATATTTTTCAG  Hind III
SP115A  NO: 429  AGTCGGATCCTAAGGCTGATAATCGTGTTCAAATG  Bam HI
SP115B  NO: 430  GACTAAGCTTAAAATTAGATAGACGTTGAGT  Hind III
SP117A  NO: 431  AGTCGGATCCCTGTGGCAATCAGTCAGCTGCTTCC  Bam HI
SP117B  NO: 432  GACTGTCGACTTTAATCTTGTCCCAGGTGGTTAATTTGCC  Sal I


SP118A  NO: 433  ACTGGTCGACTTGTCAACAACAACATGCTACTTCTGAG  Sal I
SP118B  NO: 434  GACTCTGCAGAAGTTTAACCCACTTATCATTATCC  Pst I
SP119A  NO: 435  ACTGGGATCCTTGTTCAGGCAAGTCCGTGACTAGTGAAC  Bam HI
SP119B  NO: 436  GACTAAGCTTGGCTAATTCCTTCAAAGTTTGCA  Hind III
SP120A  NO: 437  AGTCGGATCCCTCGCAAATTGAAAAGGCGGCAGTTAGCC  Bam HI
SP120B  NO: 438  GACTAAGCTTGTAAATAAGCGTACCTTTTTCTTCC  Hind III
SP121A  NO: 439  TCAGGGATCCTTGTCAGTCAGGTTCTAATGGTTCTCAG  Bam HI
SP121B  NO: 440  AGTCAAGCTTGGCATTGGCGTCGCCGTCCTTC  Hind III
SP122A  NO: 441  GACTGGATCCGGAAACTTCACAGGATTTTAAAGAGAAG  Bam HI
SP122B  NO: 442  GACTGTCGACAATCAATCCTTCTTCTGCACTTCT  Sal I
SP123A  NO: 443  CAGTGGATCCTGTGGTCGAAGTTGAGACTCCTCAATC  Bam HI
SP123B  NO: 444  GACTAAGCTTTTCTTCAAATTTATTATCAGC  Hind III
SP124A  NO: 445  AGTCGGATCCAACACCTGTATATAAAGTTACAGCAATCG  Bam HI
SP124B  NO: 446  GACTGTCGACTACTTGACCGAATGCGTCGAATGTACG  Sal I

Table 3

S. pneumoniae ORF Cloning Primers

Primer Name  SEQ ID  Sequence  RE

SP125A  NO: 447  CTGAGGATCCATTAGACAGATTAATTGAAATCGG  Bam HI
SP125B  NO: 448  GACTGTCGACTTTAAAGATTGAAGTTTTAAAGCT  Sal I
SP126A  NO: 449  TGACGGATCCTAAGACAGATGAACGGAGCAAGGTG  Bam HI
SP126B  NO: 450  CTGAAAGCTTTAAGGCTTCCTCAATGAGTTTGTCT  Hind III
SP127A  NO: 451  GACTGGATCCCTGTGAGAATCAAGCTACACCCA  Bam HI
SP127B  NO: 452  CTGAAAGCTTTTGTAACTGAGATTGATCTGGGAG  Hind III
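
Each primer in Table 3 is listed with a restriction enzyme (RE), and the legible sequences appear to carry that enzyme's recognition site near the 5' end. The following Python sketch is an illustration only and is not part of the original disclosure. It checks that a primer contains the site for its listed enzyme. The names `RECOGNITION_SITES` and `has_site` are hypothetical; the recognition sequences are the standard ones for these enzymes, and the three example rows are SEQ ID NOS: 447, 448 and 450 from the table.

```python
# Standard recognition sites for the enzymes named in the RE column.
RECOGNITION_SITES = {
    "Bam HI": "GGATCC",
    "Hind III": "AAGCTT",
    "Pst I": "CTGCAG",
    "Sal I": "GTCGAC",
    "Bgl II": "AGATCT",
    "Xho I": "CTCGAG",
}


def has_site(sequence: str, enzyme: str) -> bool:
    """Return True if the primer contains the enzyme's recognition site.

    Case and embedded spaces are ignored so that scanned sequences
    can be checked without prior clean-up.
    """
    cleaned = sequence.upper().replace(" ", "")
    return RECOGNITION_SITES[enzyme] in cleaned


# Example rows (name, sequence, RE) taken from Table 3.
rows = [
    ("SP125A", "CTGAGGATCCATTAGACAGATTAATTGAAATCGG", "Bam HI"),
    ("SP125B", "GACTGTCGACTTTAAAGATTGAAGTTTTAAAGCT", "Sal I"),
    ("SP126B", "CTGAAAGCTTTAAGGCTTCCTCAATGAGTTTGTCT", "Hind III"),
]

for name, seq, enzyme in rows:
    print(name, enzyme, has_site(seq, enzyme))
```

A check of this kind can flag rows where the scanned sequence or the RE label was corrupted, since a mismatch between the two is unlikely in a correctly transcribed row.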
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INDICATIONS RELATING TO A DEPOSITED MICROORGANISM 
(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page [illegible], line 12

B. IDENTIFICATION OF DEPOSIT        Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
United States of America

Date of deposit: October 10, 1996        Accession Number: 55340

C. ADDITIONAL INDICATIONS (leave blank if not applicable)        This information is continued on an additional sheet

In respect of those designations in which a European Patent is sought a sample
of the deposited microorganism will be made available until the publication of
the mention of the grant of the European patent or until the date on which the
application has been refused or withdrawn or is deemed to be withdrawn, only
by the issue of such a sample to an expert nominated by the person requesting
the sample (Rule 28(4) EPC).

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications e.g., "Accession
Number of Deposit")

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer:

Form PCT/RO/134 (July 1992)

SINGAPORE



The applicant hereby requests that the furnishing of a sample of a microorganism shall only be
made available to an expert. The request to this effect must be filed by the applicant with the 
International Bureau before the completion of the technical preparations for international 
publication of the application. 



NORWAY 

The applicant hereby requests that, until the application has been laid open to public inspection 
(by the Norwegian Patent Office), or has been finally decided upon by the Norwegian Patent 
Office without having been laid open to public inspection, the furnishing of a sample shall only 
be effected to an expert in the art. The request to this effect shall be filed by the applicant with 
the Norwegian Patent Office not later than at the time when the application is made available to 
the public under Sections 22 and 33(3) of the Norwegian Patents Act. If such a request has been
filed by the applicant, any request made by a third party for the furnishing of a sample shall
indicate the expert to be used. That expert may be any person entered on a list of recognized
experts drawn up by the Norwegian Patent Office or any person approved by the applicant in the
individual case.



AUSTRALIA 

The applicant hereby gives notice that the furnishing of a sample of a microorganism shall only
be effected prior to the grant of a patent, or prior to the lapsing, refusal or withdrawal of the
application, to a person who is a skilled addressee without an interest in the invention
(Regulation 3.25(3) of the Australian Patents Regulations).



FINLAND 

The applicant hereby requests that, until the application has been laid open to public inspection
(by the National Board of Patents and Registration), or has been finally decided upon by the
National Board of Patents and Registration without having been laid open to public inspection,
the furnishing of a sample shall only be effected to an expert in the art.



ICELAND 

The applicant hereby requests that, until the application has been laid open to public inspection
(by the Icelandic Patent Office), or has been finally decided upon by the Icelandic Patent Office
without having been laid open to public inspection, the furnishing of a sample shall only be
effected to an expert in the art.
DENMARK



The applicant hereby requests that, until the application has been laid open to public inspection 
(by the Danish Patent Office), or has been finally decided upon by the Danish Patent Office 
without having been laid open to public inspection, the furnishing of a sample shall only be 
effected to an expert in the art. The request to this effect shall be filed by the applicant with the 
Danish Patent Office not later than at the time when the application is made available to the 
public under Sections 22 and 33(3) of the Danish Patents Act. If such a request has been filed by 
the applicant, any request made by a third party for the furnishing of a sample shall indicate the 
expert to be used. That expert may be any person entered on a list of recognized experts drawn 
up by the Danish Patent Office or any person approved by the applicant in the individual case.



SWEDEN 

The applicant hereby requests that, until the application has been laid open to public inspection 
(by the Swedish Patent Office), or has been finally decided upon by the Swedish Patent Office 
without having been laid open to public inspection, the furnishing of a sample shall only be 
effected to an expert in the art. The request to this effect shall be filed by the applicant with the
International Bureau before the expiration of 16 months from the priority date (preferably on the
Form PCT/RO/134 reproduced in annex Z of Volume I of the PCT Applicant's Guide). If such a
request has been filed by the applicant, any request
made by a third party for the furnishing of a sample shall indicate the expert to be used. That
expert may be any person entered on a list of recognized experts drawn up by the Swedish Patent 
Office or any person approved by the applicant in the individual case. 



UNITED KINGDOM 

The applicant hereby requests that the furnishing of a sample of a microorganism shall only be
made available to an expert. The request to this effect must be filed by the applicant with the 
International Bureau before the completion of the technical preparations for the International 
publication of the application. 



NETHERLANDS 

The applicant hereby requests that until the date of a grant of a Netherlands patent or until the
date on which the application is refused or withdrawn or lapsed, the microorganism shall be made
available as provided in Rule 31F(1) of the Patent Rules only by the issue of a sample to an
expert. The request to this effect must be furnished by the applicant with the Netherlands
Industrial Property Office before the date on which the application is made available to the
public under Section 22C or Section 25 of the Patents Act of the Kingdom of the Netherlands,
whichever of the two dates occurs earlier.

What Is Claimed Is:

1. An isolated nucleic acid molecule comprising a polynucleotide
having a nucleotide sequence at least 95% identical to a sequence selected from 
the group consisting of: 

(a) a nucleotide sequence encoding any of the amino acid 
sequences of the polypeptides shown in Table 1; or 

(b) a nucleotide sequence complementary to any of the nucleotide 
sequences in (a). 
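
As a rough illustration of a percent-identity threshold such as the 95% recited in claim 1, the following Python sketch compares two sequences position by position. It is a simplification and an assumption of this note: it requires the sequences to be already aligned to equal length and does not reproduce whatever alignment program or parameters the specification relies on. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which two sequences match.

    Assumes `a` and `b` are already aligned and of equal length.
    Gap characters ('-') never count as matches.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


# Hypothetical 20-nucleotide reference and a query with one mismatch.
reference = "ACGTACGTACGTACGTACGT"
query = "ACGTACGTACGTACGTACGA"

identity = percent_identity(reference, query)
print(identity, identity >= 95.0)
```

With one mismatch in twenty positions the query sits exactly at the 95% boundary, which is why the comparison uses `>=` to mirror the wording "at least 95% identical".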

2. An isolated nucleic acid molecule comprising a polynucleotide 
which hybridizes under stringent hybridization conditions to a polynucleotide 
having a nucleotide sequence identical to a nucleotide sequence in (a) or (b) of 
claim 1 wherein said polynucleotide which hybridizes does not hybridize under 
stringent hybridization conditions to a polynucleotide having a nucleotide 
sequence consisting of only A residues or of only T residues. 
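
Claims 1(b) and 2 refer to complementary sequences and exclude polynucleotides that hybridize only to runs of A or runs of T. The following Python sketch shows, under simple assumptions, how a reverse complement can be computed and how a sequence consisting of only A or only T residues can be recognized. The function names `reverse_complement` and `is_only_a_or_only_t` are hypothetical, and the sketch handles only the four unambiguous DNA bases.

```python
# Translation table for the four unambiguous DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_only_a_or_only_t(seq: str) -> bool:
    """Return True if the sequence consists solely of A or solely of T."""
    residues = set(seq.upper())
    return residues == {"A"} or residues == {"T"}


print(reverse_complement("AAGCTTG"))
print(is_only_a_or_only_t("AAAAAAAA"))
print(is_only_a_or_only_t("AAGCTT"))
```

Note that palindromic restriction sites such as GGATCC are their own reverse complement, which is a quick sanity check for the first function.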

3. An isolated nucleic acid molecule comprising a polynucleotide 
which encodes the amino acid sequence of an epitope-bearing portion of a 
polypeptide having an amino acid sequence in (a) of claim 1. 

4. The isolated nucleic acid molecule of claim 3, wherein said 
epitope-bearing portion of a polypeptide has an amino acid sequence listed in 
Table 2. 

5. A method for making a recombinant vector comprising inserting
an isolated nucleic acid molecule of claim 1 into a vector. 

6. A recombinant vector produced by the method of claim 5.

7. A method of making a recombinant host cell comprising 
introducing the recombinant vector of claim 6 into a host cell. 

8. A recombinant host cell produced by the method of claim 7.

9. A method of producing a polypeptide encoded by the nucleic
acid molecule of claim 1 comprising culturing the host cell of claim 8 under
conditions favoring expression of the heterologous polypeptide.

10. A polypeptide produced according to the method of claim 9.

11. An isolated polypeptide comprising an amino acid sequence at 
least 70% identical to a sequence selected from the group consisting of an amino 
acid sequence of any of the polypeptides described in Table 1. 

12. An isolated polypeptide antigen comprising an amino acid 
sequence of an S. pneumoniae epitope shown in Table 2. 

13. An isolated nucleic acid molecule comprising a polynucleotide 
with a nucleotide sequence encoding a polypeptide of claim 9. 

14. An isolated antibody that binds specifically to a polypeptide of
claim 11.

15. A hybridoma which produces an antibody according to claim 14. 

16. A vaccine, comprising:

(1) one or more S. pneumoniae polypeptides selected from the
group consisting of a polypeptide comprising an amino acid sequence identified
in Table 1, or a fragment thereof; and

(2) a pharmaceutically acceptable diluent, carrier, or excipient;
wherein said polypeptide is present in an amount effective to elicit protective
antibodies in an animal to a member of the Streptococcus genus.

17. A method of preventing or attenuating an infection caused by a
member of the Streptococcus genus in an animal, comprising administering to
said animal a polypeptide of claim 11, wherein said polypeptide is administered
in an amount effective to prevent or attenuate said infection.

18. A method of detecting Streptococcus nucleic acids in a biological
sample obtained from an animal involving assaying for one or more nucleic acid
sequences encoding Streptococcus polypeptides in a sample comprising:

(a) contacting the sample with one or more of the above-described
nucleic acid probes, under conditions such that hybridization occurs, and

(b) detecting hybridization of said one or more probes to the one or
more Streptococcus nucleic acid sequences present in the biological sample.

19. A method of detecting Streptococcus nucleic acids in a biological
sample obtained from an animal, comprising: 

(a) amplifying one or more Streptococcus nucleic acid sequences in said 
sample using polymerase chain reaction, and 

(b) detecting said amplified Streptococcus nucleic acid. 

20. A kit for detecting Streptococcus antibodies in a biological 
sample obtained from an animal, comprising 

(a) a polypeptide of claim 12 attached to a solid support; and 

(b) detecting means. 

21. A method of detecting Streptococcus antibodies in a biological 
sample obtained from an animal, comprising 

(a) contacting the sample with a polypeptide of claim 12; and 

(b) detecting antibody-antigen complexes. 



WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 6: C12N 15/31, 5/18, 1/21, C07K 14/315,
C12Q 1/68, A61K 39/09, G01N 33/569, 33/68        A3

(11) International Publication Number: WO 98/18930

(43) International Publication Date: 7 May 1998 (07.05.98)

(21) International Application Number: PCT/US97/19422

(22) International Filing Date: 30 October 1997 (30.10.97)

(30) Priority Data:
60/029,960        31 October 1996 (31.10.96)        US

(71) Applicant (for all designated States except US): HUMAN
GENOME SCIENCES, INC. [US/US]; 9410 Key West
Avenue, Rockville, MD 20850 (US).

(72) Inventors; and

(75) Inventors/Applicants (for US only): KUNSCH, Charles, A.
[US/US]; 2398B Dunwoody Crossing, Atlanta, GA 30338
(US). CHOI, Gil, H. [KR/US]; 11429 Potomac Oaks
Drive, Rockville, MD 20850 (US). JOHNSON, L., Sydnor
[US/US]; 13545 Ambassador Drive, Germantown, MD
20874 (US). HROMOCKYJ, Alex [US/US]; 10003 Sidney
Road, Silver Spring, MD 20901 (US).

(74) Agents: BROOKES, A., Anders et al.; Human Genome
Sciences, Inc., 9410 Key West Avenue, Rockville, MD
20850 (US).

(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE,
GH, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, LC, LK,
LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, MX, NO,
NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR,
TT, UA, UG, US, UZ, VN, YU, ZW, ARIPO patent (GH,
KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ,
BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE,
CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL,
PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN,
ML, MR, NE, SN, TD, TG).

Published

With international search report.
Before the expiration of the time limit for amending the claims
and to be republished in the event of the receipt of amendments.

(88) Date of publication of the international search report:
8 October 1998 (08.10.98)

(54) Title: STREPTOCOCCUS PNEUMONIAE ANTIGENS AND VACCINES

(57) Abstract

The present invention relates to novel vaccines for the prevention or attenuation of infection by Streptococcus pneumoniae. The
invention further relates to isolated nucleic acid molecules encoding antigenic polypeptides of Streptococcus pneumoniae. Antigenic
polypeptides are also provided, as are vectors, host cells and recombinant methods for producing the same. The invention additionally
relates to diagnostic methods for detecting Streptococcus nucleic acids, polypeptides and antibodies in a biological sample.



FOR THE PURPOSES OF INFORMATION ONLY
Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT.

AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe



INTERNATIONAL SEARCH REPORT 



Mam V Application No 

PCT/US 97/19422 



A. CLASSIFICATION OF SUBJECT MATTER 

IPC 6 C12N15/31 C12N5/18 C12N1/21 C07K14/315 C12Q1/68
A61K39/09 G01N33/569 G01N33/68

According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols)

IPC 6 C12N C07K C12Q A61K G01N



Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched



Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category * Citation of document, with indication, where appropriate, of the relevant passages



Relevant to claim No.



WO 95 06732 A (UNIV ROCKEFELLER ;MASURE H
ROBERT (US); PEARCE BARBARA J (US); TUO) 9
March 1995
SEQ ID nos. 3 and 4
see claims 1-52

C. MARTIN ET AL.: "Relatedness of
penicillin-binding protein 1a genes from
different clones of penicillin-resistant
Streptococcus pneumoniae isolated in South
Africa and Spain"
EMBO J.,
vol. 11, no. 11, November 1992, OXFORD
UNIVERSITY PRESS, GB,
pages 3831-3836, XP002060148
see the whole document



1-21 



1-15 






Further documents are listed in the continuation of box C.



Patent family members are listed in annex.



* Special categories of cited documents :

"A" document defining the general state of the art which is not
considered to be of particular relevance

"E" earlier document but published on or after the international
filing date

"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
other means

"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"P" document published prior to the international filing date but
later than the priority date claimed

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art

"&" document member of the same patent family



Date of the actual completion of the international search

6 May 1998 


Date of mailing of the international search report

18.08.1998


Name and mailing address of the ISA

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016


Authorized officer

HORNIG H.






page 1 of 2 



INTERNATIONAL SEARCH REPORT 



International Application No

PCT/US 97/19422 



C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT



Category * Citation of document, with indication, where appropriate, of the relevant passages



Relevant to claim No.



WO 96 16682 A (ASTRA AB ;BALGANESH TANJORE
SOUNDARARAJA (IN); TOWN CHRISTINE MARY) 30
May 1996
SEQ ID nos. 5 and 6
see claims 1-26

WO 95 31548 A (UAB RESEARCH FOUNDATION
;YOTHER JANET (US); DILLARD JOSEPH P (US))
23 November 1995
see the whole document

WO 95 14712 A (RES CORP TECHNOLOGIES INC)
1 June 1995
see the whole document

WO 96 05859 A (AMERICAN CYANAMID CO) 29
February 1996
see abstract

WO 93 10238 A (US HEALTH) 27 May 1993
see the whole document

EP 0 687 688 A (UNIV OVIEDO ;UNIV
LEICESTER (GB)) 20 December 1995
see abstract

EP 0 622 081 A (UAB RESEARCH FOUNDATION) 2
November 1994
see the whole document

B.J. PEARCE ET AL.: "Genetic
identification of exported proteins in
Streptococcus pneumoniae"
MOLECULAR MICROBIOL.,
vol. 9, no. 5, 1993, BLACKWELL, OXFORD,
GB,
pages 1037-1050, XP002060149
see the whole document



1-15 



1-21 

1-21 

1-21 

1-21 
1-21 

1-21 

1-21 



2 



Form PCT/ISA/210 (continuation of second sheet) (July 1992)
page 2 of 2



International application No.



INTERNATIONAL SEARCH REPORT 



PCT/US 97/19422



Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)



This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:



Remark: Although claim 17 is directed to a method of treatment of the
human/animal body, the search has been carried out and based on the alleged 
effects of the compound/composition. 



because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:



because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)



This International Searching Authority found multiple inventions in this international application, as follows:



see continuation-sheet 



1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.
3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report
4. [X] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



1. Claims: (1-21) partially 

An isolated nucleic acid molecule comprising a
polynucleotide having a nucleotide sequence at least 95%
identical to a sequence from the group consisting of: (a)
a nucleotide sequence SEQ ID no. 1 encoding the amino
acid sequence of the polypeptide SEQ ID no. 2 shown in Table
1; or (b) a nucleotide sequence complementary to said
nucleotide sequence in (a); an isolated nucleic acid
molecule comprising a polynucleotide which hybridizes
under stringent conditions to a polynucleotide having a
nucleotide sequence identical to a nucleotide sequence in
(a) or (b), wherein said polynucleotide which hybridizes
does not hybridize under stringent hybridization conditions
to a polynucleotide having a nucleotide sequence consisting
of only A or of only T residues; an isolated nucleic acid
molecule comprising a polynucleotide which encodes the amino
acid sequence or an epitope-bearing portion of a polypeptide
having an amino acid sequence of SEQ ID no. 2 in (a); said
epitope-bearing portion of said polypeptide has an amino
acid sequence listed in Table 2; a method of making a vector
using said isolated nucleic acid molecule; said recombinant
vector; a method of making a recombinant host cell using
said vector; said recombinant host cell; a method of
producing said polypeptide; said polypeptide; an isolated
antibody that binds to said polypeptide; a hybridoma which
produces said antibody; a vaccine comprising said
polypeptide selected from SEQ ID no. 2 in Table 1, or a
fragment thereof; a method of preventing or attenuating an
infection caused by a member of the Streptococcus genus in
an animal using said polypeptide; a method for detecting
Streptococcus nucleic acid sequences using the
above-described nucleic acid probe; a kit for detecting
Streptococcus antibodies in a biological sample using said
polypeptide sequence;



2-113. Claims: (1-21) partially 

Idem as subject 1 but limited to the sequences having SEQ
ID nos. 3 to 226. (Invention 2 is limited to SEQ ID nos. 3
and 4; Invention 3 is limited to SEQ ID nos. 5 and 6;
Invention 113 is limited to SEQ ID nos. 225 and 226).

For the sake of conciseness, the first group is explicitly
defined, the other groups are defined by analogy hereto.
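
The grouping stated above (Invention 1 covers SEQ ID nos. 1 and 2, Invention 2 covers nos. 3 and 4, Invention 113 covers nos. 225 and 226) is consistent with invention n covering SEQ ID nos. 2n-1 and 2n. The following is a minimal Python sketch of that arithmetic, offered only as an illustration. The function name is hypothetical, and reading the odd number as the nucleotide sequence and the even number as the polypeptide is an assumption carried over by analogy from group 1; the report itself states only the number pairs.

```python
from typing import Tuple

FIRST_INVENTION = 1
LAST_INVENTION = 113  # the report lists inventions 1 to 113


def seq_ids_for_invention(n: int) -> Tuple[int, int]:
    """Return the pair of SEQ ID numbers covered by invention group n.

    Assumption (by analogy with group 1): the first number is the
    nucleotide sequence and the second is the encoded polypeptide.
    """
    if not FIRST_INVENTION <= n <= LAST_INVENTION:
        raise ValueError("invention number must be between 1 and 113")
    return (2 * n - 1, 2 * n)


if __name__ == "__main__":
    # Print the groups explicitly named in the search report.
    for group in (1, 2, 3, 113):
        print(group, seq_ids_for_invention(group))
```

Under this reading the 113 groups together account for SEQ ID nos. 1 to 226, with no sequence assigned to more than one group.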
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